Abstract

In this contribution, the capacity-achieving input covariance matrices for coherent block-fading
correlated MIMO Rician channels are determined. In contrast with the Rayleigh and
uncorrelated Rician cases, no closed-form expressions for the eigenvectors of the optimum
input covariance matrix are available. Classically, both the eigenvectors and eigenvalues are
computed by numerical techniques. As the corresponding optimization algorithms are not very
attractive, an approximation of the average mutual information is evaluated in this paper in the
asymptotic regime where the numbers of transmit and receive antennas converge to $+\infty$ at the
same rate. New results related to the accuracy of the corresponding large system approximation
are provided. An attractive optimization algorithm of this approximation is proposed and we
establish that it yields an effective way to compute the capacity-achieving covariance matrix
for the average mutual information. Finally, numerical simulation results show that, even for a
moderate number of transmit and receive antennas, the new approach provides the same results
as direct maximization approaches of the average mutual information, while being much more
computationally attractive.

I. Introduction

Since the seminal work of Telatar [39], the advantage of considering multiple antennas at the
transmitter and the receiver in terms of capacity, for Gaussian and fast Rayleigh fading
single-user channels, is well understood. In that paper, the figure of merit chosen for characterizing
the performance of a coherent$^1$ communication over a fading Multiple Input Multiple Output
(MIMO) channel is the Ergodic Mutual Information (EMI). This choice will be justified in
section II-C. Assuming the knowledge of the channel statistics at the transmitter, one important
issue is then to maximize the EMI with respect to the channel input distribution. Without loss of
optimality, the search for the optimal input distribution can be restricted to circular Gaussian
inputs. The problem then amounts to finding the optimum covariance matrix.

This optimization problem has been addressed extensively in the case of certain Rayleigh 
channels. In the context of the so-called Kronecker model, it has been shown by various authors 
(see e.g. [15] for a review) that the eigenvectors of the optimal input covariance matrix must 
coincide with the eigenvectors of the transmit correlation matrix. It is therefore sufficient to 
evaluate the eigenvalues of the optimal matrix, a problem which can be solved by using standard 
optimization algorithms. Note that [40] extended this result to more general (non Kronecker) 
Rayleigh channels. 

Rician channels have been comparatively less studied from this point of view. Let us mention 
the work [19] devoted to the case of uncorrelated Rician channels, where the authors proved that 
the eigenvectors of the optimal input covariance matrix are the right-singular vectors of the line 
of sight component of the channel. As in the Rayleigh case, the eigenvalues can then be evaluated 
by standard routines. The case of correlated Rician channels is undoubtedly more complicated 
because the eigenvectors of the optimum matrix have no closed form expressions. Moreover, 
the exact expression of the EMI being complicated (see e.g. [22]), both the eigenvalues and the 
eigenvectors have to be evaluated numerically. In [42], a barrier interior-point method is proposed 
and implemented to directly evaluate the EMI as an expectation. The corresponding algorithms 
are however not very attractive because they rely on computationally-intensive Monte-Carlo 
simulations. 

In this paper, we address the optimization of the input covariance of Rician channels with a
two-sided (Kronecker) correlation. As the exact expression of the EMI is very complicated, we
propose to evaluate an approximation of the EMI, valid when the numbers of transmit and receive
antennas converge to $+\infty$ at the same rate, and then to optimize this asymptotic approximation.
This will turn out to be a simpler problem. The results of the present contribution have been
presented in part in the short conference paper [12].

1. Instantaneous channel state information is assumed at the receiver but not necessarily at the transmitter.

The asymptotic approximation of the mutual information has been obtained by various authors
in the case of MIMO Rayleigh channels, and has been shown to be quite reliable even for a
moderate number of antennas. The general case of a Rician correlated channel has recently
been established in [17] using large random matrix theory and completes a number of previous
works, among which [9], [41] and [30] (Rayleigh channels), [8] and [31] (Rician uncorrelated
channels), [10] (Rician receive correlated channel) and [37] (Rician correlated channels). Notice
that the latter work (together with [30] and [31]) relies on the powerful but non-rigorous replica
method. It also gives an expression for the variance of the mutual information. We finally
mention the recent paper [38] in which the authors generalize our approach sketched in [12]
to the MIMO Rician channel with interference. The optimization algorithm of the large system
approximant of the EMI proposed in [38] is however different from our proposal.

In this paper, we rely on the results of [17] in which a closed-form asymptotic approximation
for the mutual information is provided, and present new results concerning its accuracy. We then
address the optimization of the large system approximation w.r.t. the input covariance matrix
and propose a simple iterative maximization algorithm which, in some sense, can be seen as
a generalization to the Rician case of [44], which is devoted to the Rayleigh context: each iteration
is devoted to solving a system of two nonlinear equations as well as a standard waterfilling
problem. Among the convergence results that we provide (and in contrast with [44]), we
prove that, whenever the algorithm converges, its limit is the optimum input covariance matrix.
We also prove that the matrix which optimizes the large system approximation
asymptotically achieves the capacity. This result is of practical importance as it asserts
that the optimization algorithm yields a procedure that asymptotically achieves the true capacity.
Finally, simulation results confirm the relevance of our approach.

The paper is organized as follows. Section II is devoted to the presentation of the channel 
model and the underlying assumptions. The asymptotic approximation of the ergodic mutual 
information is given in section III. In section IV, the strict concavity of the asymptotic 
approximation as a function of the covariance matrix of the input signal is established ; it 
is also proved that the resulting optimal argument asymptotically achieves the true capacity. 
The maximization problem of the EMI approximation is studied in section V. Validations, 
interpretations and numerical results are provided in section VI. 
II. Problem statement

A. General Notations

In this paper, the notations $s$, $x$, $M$ stand for scalars, vectors and matrices, respectively. As
usual, $\|x\|$ represents the Euclidean norm of vector $x$ and $\|M\|$ stands for the spectral norm
of matrix $M$. The superscripts $(\cdot)^T$ and $(\cdot)^H$ represent respectively the transpose and the
conjugate transpose. The trace of $M$ is denoted by $\mathrm{Tr}(M)$. The mathematical expectation operator is
denoted by $E(\cdot)$ and the symbols $\Re$ and $\Im$ denote respectively the real and imaginary parts
of a given complex number. If $x$ is a possibly complex-valued random variable,
$\mathrm{Var}(x) = E|x|^2 - |E(x)|^2$ represents the variance of $x$.

Throughout this paper, $t$ and $r$ stand for the numbers of transmit and receive antennas, respectively.
Certain quantities will be studied in the asymptotic regime $t \to \infty$, $r \to \infty$ in such a way that
$t/r \to c \in (0, +\infty)$. In order to simplify the notations, $t \to +\infty$ should be understood from now
on as $t \to \infty$, $r \to \infty$ and $t/r \to c \in (0, +\infty)$. A matrix $M_t$ whose size depends on $t$ is said
to be uniformly bounded if $\sup_t \|M_t\| < +\infty$.

Several variables used throughout this paper depend on various parameters, e.g. the number 
of antennas, the noise level, the covariance matrix of the transmitter, etc. In order to simplify 
the notations, we may not always mention all these dependencies. 

B. Channel model 

We consider a wireless MIMO link with $t$ transmit and $r$ receive antennas. In our analysis, the
channel matrix can possibly vary from symbol vector (or space-time codeword) to symbol vector.
The channel matrix is assumed to be perfectly known at the receiver whereas the transmitter
has only access to the statistics of the channel. The received signal can be written as

$$y(\tau) = H(\tau)\,x(\tau) + z(\tau) \qquad (1)$$

where $x(\tau)$ is the $t \times 1$ vector of transmitted symbols at time $\tau$, $H(\tau)$ is the $r \times t$ channel
matrix (stationary and ergodic process) and $z(\tau)$ is a complex white Gaussian noise distributed
as $N(0, \sigma^2 I_r)$. For the sake of simplicity, we omit the time index $\tau$ from our notations. The
channel input is subject to a power constraint $\mathrm{Tr}[E(xx^H)] \le t$. Matrix $H$ has the following
structure:

$$H = \sqrt{\frac{K}{K+1}}\,A + \frac{1}{\sqrt{K+1}}\,V, \qquad (2)$$

where matrix $A$ is deterministic, $V$ is a random matrix and constant $K \ge 0$ is the so-called
Rician factor which expresses the relative strength of the direct and scattered components of
the received signal. Matrix $A$ satisfies $\frac{1}{r}\mathrm{Tr}(AA^H) = 1$ while $V$ is given by

$$V = \frac{1}{\sqrt{t}}\,C^{1/2}\,W\,\tilde C^{1/2}, \qquad (3)$$

where $W = (W_{ij})$ is a $r \times t$ matrix whose entries are independent and identically distributed
(i.i.d.) complex circular Gaussian random variables $\mathcal{CN}(0,1)$, i.e. $W_{ij} = \Re W_{ij} + i\,\Im W_{ij}$ where
$\Re W_{ij}$ and $\Im W_{ij}$ are independent centered real Gaussian random variables with variance $\frac12$. The
matrices $\tilde C > 0$ and $C > 0$ account for the transmit and receive antenna correlation effects
respectively and satisfy $\frac{1}{t}\mathrm{Tr}(\tilde C) = 1$ and $\frac{1}{r}\mathrm{Tr}(C) = 1$. This correlation structure is often referred
to as a separable or Kronecker correlation model.
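
As an illustration of the model (2)-(3), the following minimal sketch draws one channel realization. The exponential correlation profile, the all-ones line-of-sight matrix and the function names are assumptions made for the example only; they are not part of the model above.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def exp_correlation(n, rho):
    """Hypothetical correlation model with entries rho^|i-j|, so that Tr(C)/n = 1."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def rician_channel(A, C, C_tilde, K, rng):
    """One draw of H = sqrt(K/(K+1)) A + V / sqrt(K+1), with V as in (3)."""
    r, t = A.shape
    # W has i.i.d. CN(0, 1) entries: real and imaginary parts of variance 1/2.
    W = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
    V = psd_sqrt(C) @ W @ psd_sqrt(C_tilde) / np.sqrt(t)
    return np.sqrt(K / (K + 1)) * A + V / np.sqrt(K + 1)

rng = np.random.default_rng(0)
r, t, K = 4, 4, 1.0
A = np.ones((r, t), dtype=complex)               # illustrative rank-one LOS component
A *= np.sqrt(r / np.trace(A @ A.conj().T).real)  # enforce Tr(A A^H) / r = 1
C, C_tilde = exp_correlation(r, 0.5), exp_correlation(t, 0.3)
H = rician_channel(A, C, C_tilde, K, rng)
```

The normalizations $\frac1r\mathrm{Tr}(AA^H) = 1$, $\frac1r\mathrm{Tr}(C) = 1$ and $\frac1t\mathrm{Tr}(\tilde C) = 1$ are enforced explicitly so that $K$ keeps its interpretation as the ratio between direct and scattered power.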

Remark 1: Note that no extra assumption related to the rank of the deterministic component
$A$ of the channel is made. It is often assumed that $A$ has rank one ([15], [27], [18],
[26], etc.) because of the relatively small path loss exponent of the direct path. Although the
rank-one assumption is often relevant, it becomes questionable if one wants to address, for
instance, a multi-user setup and determine the sum-capacity of a cooperative multiple access
or broadcast channel in the high cooperation regime. Consider for example a macro-diversity
situation in the downlink: several base stations interconnected$^2$ through ideal wireline channels
cooperate to maximize the performance of a given multi-antenna receiver. Here the matrix $A$
is likely to have a rank higher than one or even to be of full rank. Assume that the receive
array of antennas is linear and uniform. Then a typical structure for $A$ is

$$A = \frac{1}{\sqrt{t}}\,[a(\theta_1), \ldots, a(\theta_t)]\,\Lambda, \qquad (4)$$

where $a(\theta) = (1, e^{i\theta}, \ldots, e^{i(r-1)\theta})^T$ and $\Lambda$ is a diagonal matrix whose entries represent the
complex amplitudes of the $t$ line of sight (LOS) components.
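
The structure (4) can be sketched numerically as follows. The angles and amplitudes are hypothetical values chosen for the example; unit-modulus amplitudes are assumed so that $\frac1r\mathrm{Tr}(AA^H) = 1$ holds.

```python
import numpy as np

def steering(theta, r):
    """a(theta) = (1, e^{i theta}, ..., e^{i (r-1) theta})^T."""
    return np.exp(1j * theta * np.arange(r))

def los_matrix(thetas, amplitudes, r):
    """A = t^{-1/2} [a(theta_1), ..., a(theta_t)] Lambda, as in (4)."""
    t = len(thetas)
    cols = np.stack([steering(th, r) for th in thetas], axis=1)  # r x t
    return cols @ np.diag(amplitudes) / np.sqrt(t)

r = 6
thetas = np.array([0.3, 1.1, 2.0])    # hypothetical, pairwise distinct angles
amps = np.array([1.0, 1.0j, -1.0])    # hypothetical unit-modulus LOS amplitudes
A = los_matrix(thetas, amps, r)
rank_A = np.linalg.matrix_rank(A)
norm_A = np.trace(A @ A.conj().T).real / r
```

With pairwise distinct angles and $r \ge t$, the columns form a Vandermonde family, so the rank of $A$ equals $t$ in this example, which illustrates why the rank-one assumption need not hold in a macro-diversity setting.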



C. Maximum ergodic mutual information 



We denote by $\mathcal{C}$ the cone of nonnegative Hermitian $t \times t$ matrices and by $\mathcal{C}_1$ the subset of
all matrices $Q$ of $\mathcal{C}$ for which $\frac{1}{t}\mathrm{Tr}(Q) = 1$. Let $Q$ be an element of $\mathcal{C}_1$ and denote by $I(Q)$
the ergodic mutual information (EMI) defined by:

$$I(Q) = E_H\left[\log\det\left(I_r + \frac{1}{\sigma^2}\,HQH^H\right)\right]. \qquad (5)$$

Maximizing the EMI with respect to the input covariance matrix $Q = E(xx^H)$ leads to the
channel Shannon capacity for fast fading MIMO channels, i.e. when the channel varies from
symbol to symbol. This capacity is achieved by averaging over channel variations over time.
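
For reference, the expectation in (5) can be estimated by plain Monte-Carlo averaging, as in the sketch below. The function names and the i.i.d. Rayleigh channel used in the demonstration (i.e. $K = 0$, $C = I_r$, $\tilde C = I_t$) are assumptions made for illustration; this is not the interior-point method of [42].

```python
import numpy as np

def emi_monte_carlo(Q, draw_channel, sigma2, n_draws, rng):
    """Average of log det(I_r + H Q H^H / sigma2), in nats, over n_draws channel draws."""
    total = 0.0
    for _ in range(n_draws):
        H = draw_channel(rng)
        r = H.shape[0]
        G = np.eye(r) + (H @ Q @ H.conj().T) / sigma2
        total += np.linalg.slogdet(G)[1]  # log|det G|; det G > 0 since G is positive definite
    return total / n_draws

def draw_rayleigh(rng, r=4, t=4):
    """Hypothetical test channel: H = W / sqrt(t) with i.i.d. CN(0, 1) entries in W."""
    W = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
    return W / np.sqrt(t)

rng = np.random.default_rng(0)
estimate = emi_monte_carlo(np.eye(4), draw_rayleigh, sigma2=1.0, n_draws=200, rng=rng)
```

Each evaluation of the objective costs `n_draws` determinant computations, which gives a sense of why optimizing $I(Q)$ directly over $Q$ is computationally demanding.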

2. For example, in a cellular system the base stations are connected with one another via a radio network controller.

We will denote by $C_E$ the maximum value of the EMI over the set $\mathcal{C}_1$:

$$C_E = \sup_{Q \in \mathcal{C}_1} I(Q). \qquad (6)$$

The optimal input covariance matrix thus coincides with the argument of the above maximization
problem. Note that $I : Q \mapsto I(Q)$ is a strictly concave function on the convex set $\mathcal{C}_1$, which
guarantees the existence of a unique maximum $Q_*$ (see [28]). When $\tilde C = I_t$, $C = I_r$, [19] shows
that the eigenvectors of the optimal input covariance matrix coincide with the right-singular
vectors of $A$. By adapting the proof of [19], one can easily check that this result also holds
when $\tilde C = I_t$ and $C$ and $AA^H$ share a common eigenvector basis. Apart from these two simple
cases, it seems difficult to find a closed-form expression for the eigenvectors of the optimal
covariance matrix. Therefore the evaluation of $C_E$ requires the use of numerical techniques
(see e.g. [42]) which are very demanding since they rely on computationally-intensive
Monte-Carlo simulations. This problem can be circumvented as the EMI $I(Q)$ can be approximated
by a simple expression denoted by $\bar I(Q)$ (see section III) as $t \to \infty$, which in turn will be
optimized with respect to $Q$ (see section V).

Remark 2: Finding the optimum covariance matrix is useful in practice, in particular if the
channel input is assumed to be Gaussian. In fact, there exist many practical space-time encoders
that produce near-Gaussian outputs (these outputs are used as inputs for the linear precoder
$Q^{1/2}$). See for instance [34].



D. Summary of the main results. 

The main contributions of this paper can be summarized as follows : 

1) We derive an accurate approximation of $I(Q)$ as $t \to +\infty$: $I(Q) \simeq \bar I(Q)$ where

$$\bar I(Q) = \log\det\left[I_t + G(\delta(Q), \tilde\delta(Q))\,Q\right] + i(\delta(Q), \tilde\delta(Q)) \qquad (7)$$

where $\delta(Q)$ and $\tilde\delta(Q)$ are two positive terms defined as the solutions of a system of 2
equations (see Eq. (33)). The functions $G$ and $i$ depend on $(\delta(Q), \tilde\delta(Q))$, $K$, $A$, $C$, $\tilde C$,
and on the noise variance $\sigma^2$. They are given in closed form.

The derivation of $\bar I(Q)$ is based on the observation that the eigenvalue distribution of
random matrix $HQH^H$ becomes close to a deterministic distribution as $t \to +\infty$. This
in particular implies that if $(\lambda_i)_{1 \le i \le r}$ represent the eigenvalues of $HQH^H$, then:

$$\frac{1}{r}\log\det\left[I_r + \frac{1}{\sigma^2}HQH^H\right] = \frac{1}{r}\sum_{i=1}^{r}\log\left(1 + \frac{\lambda_i}{\sigma^2}\right)$$

has the same behaviour as a deterministic term, which turns out to be equal to $\bar I(Q)/r$. Taking
the mathematical expectation w.r.t. the distribution of the channel, and multiplying by $r$
gives $I(Q) \simeq \bar I(Q)$.

The error term $I(Q) - \bar I(Q)$ is shown to be of order $O(1/t)$. As $I(Q)$ is known to increase
linearly with $t$, the relative error $\frac{I(Q) - \bar I(Q)}{I(Q)}$ is of order $O(1/t^2)$. This supports the fact that
$\bar I(Q)$ is an accurate approximation of $I(Q)$, and that it is relevant to study $\bar I(Q)$ in order
to obtain some insight on $I(Q)$.

2) We prove that the function $Q \mapsto \bar I(Q)$ is strictly concave on $\mathcal{C}_1$. As a consequence,
the maximum of $\bar I$ over $\mathcal{C}_1$ is reached for a unique matrix $\bar Q_*$. We also show that
$I(\bar Q_*) - I(Q_*) = O(1/t)$ where we recall that $Q_*$ is the capacity achieving covariance
matrix. Otherwise stated, the computation of $\bar Q_*$ (see below) allows one to (asymptotically)
achieve the capacity $I(Q_*)$.

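3) We study the structure of $\bar Q_*$ and establish that $\bar Q_*$ is solution of the standard waterfilling
problem:

$$\max_{Q \in \mathcal{C}_1} \log\det\left(I + G(\delta_*, \tilde\delta_*)\,Q\right),$$

where $\delta_* = \delta(\bar Q_*)$, $\tilde\delta_* = \tilde\delta(\bar Q_*)$ and

$$G(\delta_*, \tilde\delta_*) = \frac{\delta_*}{K+1}\,\tilde C + \frac{1}{\sigma^2}\,\frac{K}{K+1}\,A^H\left(I_r + \frac{\tilde\delta_*}{K+1}\,C\right)^{-1}A.$$

This result provides insights on the structure of the approximating capacity achieving
covariance matrix, but cannot be used to evaluate $\bar Q_*$ since the parameters $\delta_*$ and $\tilde\delta_*$ depend
on the optimum matrix $\bar Q_*$. We therefore propose an attractive iterative maximization
algorithm of $\bar I(Q)$ where each iteration consists in solving a standard waterfilling problem
and a $2 \times 2$ system characterizing the parameters $(\delta, \tilde\delta)$.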
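
The waterfilling step mentioned in item 3 can be sketched as follows for a given nonzero Hermitian nonnegative matrix $G$. This is a generic textbook waterfilling routine written under the assumption that $G$ is held fixed; the function name and the numerical threshold are illustrative, and the constraint $\frac1t\mathrm{Tr}(Q) = 1$ corresponds to `total_power = t`.

```python
import numpy as np

def waterfill(G, total_power):
    """Maximize log det(I + G Q) over Q >= 0 with Tr(Q) = total_power (G != 0 assumed)."""
    g, U = np.linalg.eigh(G)
    order = np.argsort(g)[::-1]            # eigenvalues in decreasing order
    g, U = g[order], U[:, order]
    n_pos = int(np.sum(g > 1e-12))         # modes with a strictly positive gain
    inv = 1.0 / g[:n_pos]                  # inverse gains, in increasing order
    q = np.zeros_like(g)
    for k in range(n_pos, 0, -1):          # try the k strongest modes
        mu = (total_power + inv[:k].sum()) / k   # candidate water level
        if mu > inv[k - 1]:                # weakest active mode gets positive power
            q[:k] = mu - inv[:k]
            break
    return (U * q) @ U.conj().T            # Q shares the eigenvectors of G

G = np.diag([4.0, 1.0, 0.1])
Q = waterfill(G, 3.0)
```

In this example the weakest mode (gain 0.1) receives no power and the two remaining modes share the budget according to the common water level.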

III. Asymptotic behavior of the ergodic mutual information 

In this section, the input covariance matrix $Q \in \mathcal{C}_1$ is fixed and the purpose is to evaluate the
asymptotic behaviour of the ergodic mutual information $I(Q)$ as $t \to \infty$ (recall that $t \to +\infty$
means $t \to \infty$, $r \to \infty$ and $t/r \to c \in (0, +\infty)$).

As we shall see, it is possible to evaluate in closed form an accurate approximation $\bar I(Q)$ of
$I(Q)$. The corresponding result is partly based on the results of [17] devoted to the study of
the asymptotic behaviour of the eigenvalue distribution of matrix $\Sigma\Sigma^H$ where $\Sigma$ is given by

$$\Sigma = B + Y, \qquad (8)$$

matrix $B$ being a deterministic $r \times t$ matrix, and $Y$ being a $r \times t$ zero mean (possibly complex
circular Gaussian) random matrix with independent entries whose variance is given by
$E|Y_{ij}|^2 = \frac{\sigma_{ij}^2}{t}$. Notice in particular that the variables $(Y_{ij};\ 1 \le i \le r,\ 1 \le j \le t)$ are not necessarily
identically distributed. We shall refer to the triangular array $(\sigma_{ij}^2;\ 1 \le i \le r,\ 1 \le j \le t)$ as the
variance profile of $\Sigma$; we shall say that it is separable if $\sigma_{ij}^2 = d_i \tilde d_j$ where $d_i \ge 0$ for $1 \le i \le r$
and $\tilde d_j \ge 0$ for $1 \le j \le t$. Due to the unitary invariance of the EMI of Gaussian channels,
the study of $I(Q)$ will turn out to be equivalent to the study of the EMI of model (8) in the
complex circular Gaussian case with a separable variance profile.



A. Study of the EMI of the equivalent model (8). 

We first introduce the resolvent and the Stieltjes transform associated with $\Sigma\Sigma^H$ (Section
III-A.1); we then introduce auxiliary quantities (Section III-A.2) and their main properties; we
finally introduce the approximation of the EMI in this case (Section III-A.3).

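1) The resolvent, the Stieltjes transform: Denote by $S(\sigma^2)$ and $\tilde S(\sigma^2)$ the resolvents of
matrices $\Sigma\Sigma^H$ and $\Sigma^H\Sigma$ defined by:

$$S(\sigma^2) = \left[\Sigma\Sigma^H + \sigma^2 I_r\right]^{-1}, \qquad \tilde S(\sigma^2) = \left[\Sigma^H\Sigma + \sigma^2 I_t\right]^{-1}. \qquad (9)$$

These resolvents satisfy the obvious, but useful property:

$$S(\sigma^2) \le \frac{I_r}{\sigma^2}, \qquad \tilde S(\sigma^2) \le \frac{I_t}{\sigma^2}. \qquad (10)$$

Recall that the Stieltjes transform of a nonnegative measure $\mu$ is defined by $\int \frac{\mu(d\lambda)}{\lambda - z}$. The quantity
$s(\sigma^2) = \frac{1}{r}\mathrm{Tr}(S(\sigma^2))$ coincides with the Stieltjes transform of the eigenvalue distribution of
matrix $\Sigma\Sigma^H$ evaluated at point $z = -\sigma^2$. In fact, denote by $(\lambda_i)_{1 \le i \le r}$ its eigenvalues; then:

$$s(\sigma^2) = \frac{1}{r}\sum_{i=1}^{r}\frac{1}{\lambda_i + \sigma^2} = \int_{\mathbb{R}^+}\frac{\nu(d\lambda)}{\lambda + \sigma^2},$$

where $\nu$ represents the eigenvalue distribution of $\Sigma\Sigma^H$ defined as the probability distribution:

$$\nu = \frac{1}{r}\sum_{i=1}^{r}\delta_{\lambda_i},$$

where $\delta_x$ represents the Dirac distribution at point $x$. The Stieltjes transform $s(\sigma^2)$ is important
as the characterization of the asymptotic behaviour of the eigenvalue distribution of $\Sigma\Sigma^H$ is
equivalent to the study of $s(\sigma^2)$ when $t \to +\infty$ for each $\sigma^2$. This observation is the starting
point of the approaches developed by Pastur [29], Girko [13], Bai and Silverstein [1], etc.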
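
The identity between the normalized trace of the resolvent and the Stieltjes transform of the empirical eigenvalue distribution, as well as the bound (10), can be checked numerically on a single draw. The dimensions, seed and noise level below are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
r, t, sigma2 = 5, 8, 0.7
# Sigma with i.i.d. complex Gaussian entries of variance 1/t (B = 0, D and D_tilde identity).
Sigma = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2 * t)
gram = Sigma @ Sigma.conj().T

S = np.linalg.inv(gram + sigma2 * np.eye(r))   # resolvent S(sigma^2) of (9)
lam = np.linalg.eigvalsh(gram)                 # eigenvalues of Sigma Sigma^H

s_trace = np.trace(S).real / r                 # (1/r) Tr S(sigma^2)
s_eigs = np.mean(1.0 / (lam + sigma2))         # Stieltjes transform of nu at z = -sigma^2
largest = np.linalg.eigvalsh(S).max()          # should not exceed 1 / sigma^2, cf. (10)
```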

We finally recall that a positive $p \times p$ matrix-valued measure $\mu$ is a function defined on the
Borel subsets of $\mathbb{R}$ onto the set of all complex-valued $p \times p$ matrices satisfying:

(i) For each Borel set $B$, $\mu(B)$ is a Hermitian nonnegative definite $p \times p$ matrix with complex
entries;

(ii) $\mu(\emptyset) = 0$;

(iii) For each countable family $(B_n)_{n \in \mathbb{N}}$ of disjoint Borel subsets of $\mathbb{R}$,

$$\mu\left(\cup_n B_n\right) = \sum_n \mu(B_n).$$

Note that for any nonnegative Hermitian $p \times p$ matrix $M$, $\mathrm{Tr}(M\mu)$ is a (scalar) positive
measure. The matrix-valued measure $\mu$ is said to be finite if $\mathrm{Tr}(\mu(\mathbb{R})) < +\infty$.

2) The auxiliary quantities $\beta$, $\tilde\beta$, $T$ and $\tilde T$: We gather in this section many results of [17]
that will be of help in the sequel.

Assumption 1: Let $(B_t)$ be a family of $r \times t$ deterministic matrices such that:

$$\sup_{t,i}\sum_{j=1}^{t}|B_{ij}|^2 < +\infty, \qquad \sup_{t,j}\sum_{i=1}^{r}|B_{ij}|^2 < +\infty.$$

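Theorem 1: Recall that $\Sigma = B + Y$ and assume that $Y = \frac{1}{\sqrt{t}}D^{1/2}X\tilde D^{1/2}$, where $D$ and $\tilde D$
represent the diagonal matrices $D = \mathrm{diag}(d_i,\ 1 \le i \le r)$ and $\tilde D = \mathrm{diag}(\tilde d_j,\ 1 \le j \le t)$
respectively, and where $X$ is a matrix whose entries are i.i.d. complex centered with variance
one. The following facts hold true:

(i) (Existence and uniqueness of auxiliary quantities) For $\sigma^2$ fixed, consider the system of
equations:

$$\beta = \frac{1}{t}\mathrm{Tr}\left[D\left(\sigma^2(I_r + D\tilde\beta) + B(I_t + \tilde D\beta)^{-1}B^H\right)^{-1}\right],$$
$$\tilde\beta = \frac{1}{t}\mathrm{Tr}\left[\tilde D\left(\sigma^2(I_t + \tilde D\beta) + B^H(I_r + D\tilde\beta)^{-1}B\right)^{-1}\right]. \qquad (11)$$

Then, the system (11) admits a unique couple of positive solutions $(\beta(\sigma^2), \tilde\beta(\sigma^2))$. Denote
by $T(\sigma^2)$ and $\tilde T(\sigma^2)$ the following matrix-valued functions:

$$T(\sigma^2) = \left[\sigma^2(I + \tilde\beta(\sigma^2)D) + B(I + \beta(\sigma^2)\tilde D)^{-1}B^H\right]^{-1},$$
$$\tilde T(\sigma^2) = \left[\sigma^2(I + \beta(\sigma^2)\tilde D) + B^H(I + \tilde\beta(\sigma^2)D)^{-1}B\right]^{-1}. \qquad (12)$$

Matrices $T(\sigma^2)$ and $\tilde T(\sigma^2)$ satisfy

$$T(\sigma^2) \le \frac{I_r}{\sigma^2}, \qquad \tilde T(\sigma^2) \le \frac{I_t}{\sigma^2}. \qquad (13)$$

(ii) (Representation of the auxiliary quantities) There exist two uniquely defined positive
matrix-valued measures $\mu$ and $\tilde\mu$ such that $\mu(\mathbb{R}^+) = I_r$, $\tilde\mu(\mathbb{R}^+) = I_t$ and

$$T(\sigma^2) = \int_{\mathbb{R}^+}\frac{\mu(d\lambda)}{\lambda + \sigma^2}, \qquad \tilde T(\sigma^2) = \int_{\mathbb{R}^+}\frac{\tilde\mu(d\lambda)}{\lambda + \sigma^2}. \qquad (14)$$

The solutions $\beta(\sigma^2)$ and $\tilde\beta(\sigma^2)$ of system (11) are given by:

$$\beta(\sigma^2) = \frac{1}{t}\mathrm{Tr}\,D\,T(\sigma^2), \qquad \tilde\beta(\sigma^2) = \frac{1}{t}\mathrm{Tr}\,\tilde D\,\tilde T(\sigma^2), \qquad (15)$$

and can thus be written as

$$\beta(\sigma^2) = \int_{\mathbb{R}^+}\frac{\mu_b(d\lambda)}{\lambda + \sigma^2}, \qquad \tilde\beta(\sigma^2) = \int_{\mathbb{R}^+}\frac{\tilde\mu_b(d\lambda)}{\lambda + \sigma^2}, \qquad (16)$$

where $\mu_b$ and $\tilde\mu_b$ are nonnegative scalar measures defined by

$$\mu_b(d\lambda) = \frac{1}{t}\mathrm{Tr}(D\mu(d\lambda)) \quad\text{and}\quad \tilde\mu_b(d\lambda) = \frac{1}{t}\mathrm{Tr}(\tilde D\tilde\mu(d\lambda)).$$

(iii) (Asymptotic approximation) Assume that Assumption 1 holds and that

$$\sup_t \|D\| \le d_{\max} < +\infty \quad\text{and}\quad \sup_t \|\tilde D\| \le \tilde d_{\max} < +\infty.$$

For every deterministic matrices $M$ and $\tilde M$ satisfying $\sup_t \|M\| < +\infty$ and $\sup_t \|\tilde M\| < +\infty$,
the following limits hold true almost surely:

$$\lim_{t\to+\infty}\frac{1}{r}\mathrm{Tr}\left[(S(\sigma^2) - T(\sigma^2))M\right] = 0, \qquad \lim_{t\to+\infty}\frac{1}{t}\mathrm{Tr}\left[(\tilde S(\sigma^2) - \tilde T(\sigma^2))\tilde M\right] = 0. \qquad (17)$$

Denote by $\bar\mu$ and $\bar{\tilde\mu}$ the (scalar) probability measures $\bar\mu = \frac{1}{r}\mathrm{Tr}\,\mu$ and $\bar{\tilde\mu} = \frac{1}{t}\mathrm{Tr}\,\tilde\mu$, by $(\lambda_i)$
(resp. $(\tilde\lambda_j)$) the eigenvalues of $\Sigma\Sigma^H$ (resp. of $\Sigma^H\Sigma$). The following limits hold true almost
surely:

$$\lim_{t\to+\infty}\left(\frac{1}{r}\sum_{i=1}^{r}\phi(\lambda_i) - \int_0^{+\infty}\phi(\lambda)\,\bar\mu(d\lambda)\right) = 0, \qquad \lim_{t\to+\infty}\left(\frac{1}{t}\sum_{j=1}^{t}\tilde\phi(\tilde\lambda_j) - \int_0^{+\infty}\tilde\phi(\lambda)\,\bar{\tilde\mu}(d\lambda)\right) = 0, \qquad (18)$$

for continuous bounded functions $\phi$ and $\tilde\phi$ defined on $\mathbb{R}^+$.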

The proof of (i) is provided in Appendix I (note that in [17], the existence and uniqueness
of solutions to the system (11) is proved in a certain class of analytic functions depending on
$\sigma^2$, but this does not imply the existence of a unique solution $(\beta, \tilde\beta)$ when $\sigma^2$ is fixed). The rest
of the statements of Theorem 1 have been established in [17], and their proof is omitted here.
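
A minimal numerical sketch of the system (11) and of the matrices (12) is given below. It uses a plain fixed-point iteration, whose convergence is assumed here rather than proved; the function name, the starting point and the test matrices are illustrative choices.

```python
import numpy as np

def solve_beta(B, d, d_tilde, sigma2, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration for (beta, beta_tilde) of system (11).

    d and d_tilde are the diagonals of D (size r) and D_tilde (size t).
    Returns (beta, beta_tilde, T, T_tilde), with T and T_tilde as in (12).
    """
    r, t = B.shape
    BH = B.conj().T
    beta = beta_tilde = 1.0 / sigma2           # arbitrary positive starting point
    for _ in range(max_iter):
        T = np.linalg.inv(sigma2 * np.diag(1 + beta_tilde * d)
                          + B @ np.diag(1 / (1 + beta * d_tilde)) @ BH)
        T_tilde = np.linalg.inv(sigma2 * np.diag(1 + beta * d_tilde)
                                + BH @ np.diag(1 / (1 + beta_tilde * d)) @ B)
        new_beta = np.sum(d * np.diag(T).real) / t                    # (1/t) Tr D T
        new_beta_tilde = np.sum(d_tilde * np.diag(T_tilde).real) / t  # (1/t) Tr D~ T~
        done = abs(new_beta - beta) + abs(new_beta_tilde - beta_tilde) < tol
        beta, beta_tilde = new_beta, new_beta_tilde
        if done:
            break
    return beta, beta_tilde, T, T_tilde

rng = np.random.default_rng(2)
r, t, sigma2 = 3, 4, 1.0
B = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2 * t)
d = np.array([0.5, 1.0, 1.5])
d_tilde = np.array([0.4, 0.8, 1.2, 1.6])
beta, beta_tilde, T, T_tilde = solve_beta(B, d, d_tilde, sigma2)
```

Only two scalar unknowns are iterated, whatever the matrix dimensions; each step costs two matrix inversions.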

Remark 3: As shown in [17], the results in Theorem 1 do not require any Gaussian
assumption for $\Sigma$. Remark that (17) implies in some sense that the entries of $S(\sigma^2)$ and $\tilde S(\sigma^2)$
have the same behaviour as the entries of the deterministic matrices $T(\sigma^2)$ and $\tilde T(\sigma^2)$ (which
can be evaluated by solving the system (11)). In particular, using (17) for $M = I$, it follows that
the Stieltjes transform $s(\sigma^2)$ of the eigenvalue distribution of $\Sigma\Sigma^H$ behaves like $\frac{1}{r}\mathrm{Tr}\,T(\sigma^2)$,
which is itself the Stieltjes transform of measure $\bar\mu = \frac{1}{r}\mathrm{Tr}\,\mu$. The convergence statement (18),
which states that the eigenvalue distribution of $\Sigma\Sigma^H$ (resp. $\Sigma^H\Sigma$) has the same behavior as $\bar\mu$
(resp. $\bar{\tilde\mu}$), directly follows from this observation.
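3) The asymptotic approximation of the EMI: Denote by $J(\sigma^2) = E\log\det\left(I_r + \sigma^{-2}\Sigma\Sigma^H\right)$
the EMI associated with matrix $\Sigma$. First notice that

$$\frac{1}{r}\log\det\left(I_r + \frac{\Sigma\Sigma^H}{\sigma^2}\right) = \frac{1}{r}\sum_{i=1}^{r}\log\left(1 + \frac{\lambda_i}{\sigma^2}\right),$$

where the $\lambda_i$'s stand for the eigenvalues of $\Sigma\Sigma^H$. Applying (18) to function $\phi(\lambda) = \log(\lambda + \sigma^2)$
(plus some extra work since $\phi$ is not bounded), we obtain:

$$\lim_{t\to+\infty}\left(\frac{1}{r}\log\det\left(\Sigma\Sigma^H + \sigma^2 I_r\right) - \int_0^{+\infty}\log(\lambda + \sigma^2)\,d\bar\mu(\lambda)\right) = 0. \qquad (19)$$

Using the well known relation

$$\frac{1}{r}\log\det\left(I_r + \frac{\Sigma\Sigma^H}{\sigma^2}\right) = \int_{\sigma^2}^{+\infty}\left(\frac{1}{\omega} - \frac{1}{r}\mathrm{Tr}\left(\Sigma\Sigma^H + \omega I_r\right)^{-1}\right)d\omega = \int_{\sigma^2}^{+\infty}\left(\frac{1}{\omega} - \frac{1}{r}\mathrm{Tr}\,S(\omega)\right)d\omega \qquad (20)$$

together with the fact that $S(\omega) \approx T(\omega)$ (which follows from Theorem 1), it is proved in [17]
that:

$$\lim_{t\to+\infty}\left[\frac{1}{r}\log\det\left(I_r + \frac{\Sigma\Sigma^H}{\sigma^2}\right) - \int_{\sigma^2}^{+\infty}\left(\frac{1}{\omega} - \frac{1}{r}\mathrm{Tr}\,T(\omega)\right)d\omega\right] = 0 \qquad (21)$$

almost surely. Define by $\bar J(\sigma^2)$ the quantity

$$\bar J(\sigma^2) = r\int_{\sigma^2}^{+\infty}\left(\frac{1}{\omega} - \frac{1}{r}\mathrm{Tr}\,T(\omega)\right)d\omega. \qquad (22)$$

Then, $\bar J(\sigma^2)$ can be expressed more explicitly as:

$$\bar J(\sigma^2) = \log\det\left[I_r + \tilde\beta(\sigma^2)D + \frac{1}{\sigma^2}B\left(I_t + \beta(\sigma^2)\tilde D\right)^{-1}B^H\right] + \log\det\left[I_t + \beta(\sigma^2)\tilde D\right] - \sigma^2 t\,\beta(\sigma^2)\tilde\beta(\sigma^2), \qquad (23)$$

or equivalently as

$$\bar J(\sigma^2) = \log\det\left[I_t + \beta(\sigma^2)\tilde D + \frac{1}{\sigma^2}B^H\left(I_r + \tilde\beta(\sigma^2)D\right)^{-1}B\right] + \log\det\left[I_r + \tilde\beta(\sigma^2)D\right] - \sigma^2 t\,\beta(\sigma^2)\tilde\beta(\sigma^2). \qquad (24)$$

Taking the expectation with respect to the channel $\Sigma$ in (21), the EMI
$J(\sigma^2) = E\log\det\left(I_r + \sigma^{-2}\Sigma\Sigma^H\right)$ can be approximated by $\bar J(\sigma^2)$:

$$J(\sigma^2) = \bar J(\sigma^2) + o(t) \qquad (25)$$

as $t \to +\infty$. This result is fully proved in [17] and is of potential interest since the numerical
evaluation of $\bar J(\sigma^2)$ only requires solving the $2 \times 2$ system (11), while the calculation of $J(\sigma^2)$
relies either on Monte-Carlo simulations or on the implementation of rather complicated explicit
formulas (see for instance [22]).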
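
The two closed-form expressions (23) and (24) can be evaluated as in the following sketch. The equality of the two forms follows from the determinant identity $\det(I + XY) = \det(I + YX)$ and therefore holds for any positive pair $(\beta, \tilde\beta)$; in practice one would plug in the pair solving (11). The function names and the numerical values are illustrative assumptions.

```python
import numpy as np

def logdet(M):
    """log|det M| (natural logarithm)."""
    return np.linalg.slogdet(M)[1]

def j_bar_forms(B, d, d_tilde, sigma2, beta, beta_tilde):
    """Evaluate the right-hand sides of (23) and (24) for a given (beta, beta_tilde)."""
    r, t = B.shape
    P = np.diag(1 + beta_tilde * d)        # I_r + beta_tilde D
    R = np.diag(1 + beta * d_tilde)        # I_t + beta D_tilde
    BH = B.conj().T
    cross = sigma2 * t * beta * beta_tilde
    form23 = logdet(P + B @ np.linalg.inv(R) @ BH / sigma2) + logdet(R) - cross
    form24 = logdet(R + BH @ np.linalg.inv(P) @ B / sigma2) + logdet(P) - cross
    return form23, form24

rng = np.random.default_rng(3)
r, t = 3, 5
B = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2 * t)
d = np.array([0.5, 1.0, 1.5])
d_tilde = np.linspace(0.5, 1.5, t)
form23, form24 = j_bar_forms(B, d, d_tilde, sigma2=0.5, beta=0.8, beta_tilde=0.6)
```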






In order to evaluate the precision of the asymptotic approximation $\bar J$, we shall improve (25)
and get the speed $J(\sigma^2) = \bar J(\sigma^2) + O(t^{-1})$ in the next theorem. This result completes those in
[17] and, in contrast with Theorem 1, relies heavily on the Gaussian structure of
$\Sigma$. We first introduce very mild extra assumptions:

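Assumption 2: Let $(B_t)$ be a family of $r \times t$ deterministic matrices such that

$$\sup_t \|B\| \le b_{\max} < +\infty.$$

Assumption 3: Let $D$ and $\tilde D$ be respectively $r \times r$ and $t \times t$ diagonal matrices such that

$$\sup_t \|D\| \le d_{\max} < +\infty \quad\text{and}\quad \sup_t \|\tilde D\| \le \tilde d_{\max} < +\infty.$$

Assume moreover that

$$\inf_t \frac{1}{t}\mathrm{Tr}\,D > 0 \quad\text{and}\quad \inf_t \frac{1}{t}\mathrm{Tr}\,\tilde D > 0.$$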

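Theorem 2: Recall that $\Sigma = B + Y$ and assume that $Y = \frac{1}{\sqrt{t}}D^{1/2}X\tilde D^{1/2}$, where $D = \mathrm{diag}(d_i)$
and $\tilde D = \mathrm{diag}(\tilde d_j)$ are $r \times r$ and $t \times t$ diagonal matrices and where $X$ is a matrix whose entries
are i.i.d. complex circular Gaussian variables $\mathcal{CN}(0,1)$. Assume moreover that Assumptions 2
and 3 hold true. Then, for every deterministic matrices $M$ and $\tilde M$ satisfying $\sup_t \|M\| < +\infty$
and $\sup_t \|\tilde M\| < +\infty$, the following facts hold true:

$$\mathrm{Var}\left(\frac{1}{t}\mathrm{Tr}\left[S(\sigma^2)M\right]\right) = O\left(\frac{1}{t^2}\right) \quad\text{and}\quad \mathrm{Var}\left(\frac{1}{t}\mathrm{Tr}\left[\tilde S(\sigma^2)\tilde M\right]\right) = O\left(\frac{1}{t^2}\right), \qquad (26)$$

where $\mathrm{Var}(\cdot)$ stands for the variance. Moreover,

$$\frac{1}{t}\mathrm{Tr}\left[(E(S(\sigma^2)) - T(\sigma^2))M\right] = O\left(\frac{1}{t^2}\right), \qquad \frac{1}{t}\mathrm{Tr}\left[(E(\tilde S(\sigma^2)) - \tilde T(\sigma^2))\tilde M\right] = O\left(\frac{1}{t^2}\right), \qquad (27)$$

and

$$J(\sigma^2) = \bar J(\sigma^2) + O\left(\frac{1}{t}\right). \qquad (28)$$

The proof is given in Appendix II. We provide here some comments.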

Remark 4: The proof of Theorem 2 takes full advantage of the Gaussian structure of matrix
$\Sigma$ and relies on two simple ingredients:

(i) An integration by parts formula that provides an expression for the expectation of certain
functionals of Gaussian vectors, already well-known and widely used in Random Matrix
Theory [25], [32].

(ii) An inequality known as the Poincaré-Nash inequality that bounds the variance of functionals
of Gaussian vectors. Although well known, its application to random matrices is fairly
recent ([6], [33], see also [16]).






Remark 5: Equations (26) also hold in the non-Gaussian case and can be established by using
the so-called REFORM (Resolvent FORmula Martingale) method introduced by Girko ([13]).

Equations (27) and (28) are specific to the complex Gaussian structure of the channel matrix
$\Sigma$. In particular, in the non-Gaussian case, or in the real Gaussian case, one would get
$J(\sigma^2) = \bar J(\sigma^2) + O(1)$. These two facts are in accordance with:

(i) The work of [2] in which a weaker result ($o(1)$ instead of $O(t^{-1})$) is proved in the simpler
case where $B = 0$;

(ii) The predictions of the replica method in [30] (resp. [31]) in the case where $B = 0$ (resp.
in the case where $\tilde D = I_t$ and $D = I_r$).

Remark 6 (Standard deviation and bias): Eq. (26) implies that the standard deviations of
$\frac{1}{t}\mathrm{Tr}\left[(S(\sigma^2) - T(\sigma^2))M\right]$ and $\frac{1}{t}\mathrm{Tr}\left[(\tilde S(\sigma^2) - \tilde T(\sigma^2))\tilde M\right]$ are of order $O(t^{-1})$. However,
their mathematical expectations (which correspond to the bias) converge much faster towards
$0$, as (27) shows (the order is $O(t^{-2})$).

Remark 7: By adapting the techniques developed in the course of the proof of Theorem 2,
one may establish that $u^H E S(\sigma^2) v - u^H T(\sigma^2) v = O\left(\frac{1}{t}\right)$, where $u$ and $v$ are uniformly
bounded $r$-dimensional vectors.

Remark 8: Both $J(\sigma^2)$ and $\bar J(\sigma^2)$ increase linearly with $t$. Equation (28) thus implies that the
relative error $\frac{J(\sigma^2) - \bar J(\sigma^2)}{J(\sigma^2)}$ is of order $O(t^{-2})$. This remarkable convergence rate strongly supports
the observed fact that approximations of the EMI remain reliable even for small numbers of
antennas (see also the numerical results in section VI). Note that similar observations have been
made in other contexts where random matrices are used, see e.g. [3], [30].



B. Introduction of the virtual channel $HQ^{1/2}$

The purpose of this section is to establish a link between the simplified model (8): $\Sigma = B + Y$
where $Y = \frac{1}{\sqrt{t}}D^{1/2}X\tilde D^{1/2}$, $X$ being a matrix with i.i.d. $\mathcal{CN}(0,1)$ entries, $D$ and $\tilde D$ being diagonal
matrices, and the Rician model (2) under investigation: $H = \sqrt{\frac{K}{K+1}}A + \frac{1}{\sqrt{K+1}}V$ where
$V = \frac{1}{\sqrt{t}}C^{1/2}W\tilde C^{1/2}$. As we shall see, the key point is the unitary invariance of the EMI of
Gaussian channels together with a well-chosen eigenvalue/eigenvector decomposition.

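We introduce the virtual channel $HQ^{1/2}$ which can be written as:

$$HQ^{1/2} = \sqrt{\frac{K}{K+1}}\,AQ^{1/2} + \frac{1}{\sqrt{K+1}}\,\frac{1}{\sqrt{t}}\,C^{1/2}\,W\Theta\,\left(Q^{1/2}\tilde CQ^{1/2}\right)^{1/2}, \qquad (29)$$

where $\Theta$ is the deterministic unitary $t \times t$ matrix defined by

$$\Theta = \tilde C^{1/2}Q^{1/2}\left(Q^{1/2}\tilde CQ^{1/2}\right)^{-1/2}. \qquad (30)$$

The virtual channel $HQ^{1/2}$ has thus a structure similar to $H$, where $(A, C, \tilde C, W)$ are
respectively replaced with $(AQ^{1/2}, C, Q^{1/2}\tilde CQ^{1/2}, W\Theta)$.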
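
The two properties of $\Theta$ used above, namely that it is unitary and that $\tilde C^{1/2}Q^{1/2} = \Theta\,(Q^{1/2}\tilde CQ^{1/2})^{1/2}$, can be checked numerically. The sketch assumes that $Q$ and $\tilde C$ are positive definite so that the inverse square root exists; the random test matrices and helper names are illustrative.

```python
import numpy as np

def herm_power(M, p):
    """M^p for a Hermitian positive definite matrix M, via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * w**p) @ U.conj().T

def random_pd(n, rng):
    """Hypothetical test matrix: Hermitian positive definite with Tr(M) / n = 1."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    M = X @ X.conj().T + np.eye(n)
    return M * (n / np.trace(M).real)

rng = np.random.default_rng(4)
t = 4
C_tilde, Q = random_pd(t, rng), random_pd(t, rng)

Q_half, C_half = herm_power(Q, 0.5), herm_power(C_tilde, 0.5)
M = Q_half @ C_tilde @ Q_half                     # Q^{1/2} C_tilde Q^{1/2}
Theta = C_half @ Q_half @ herm_power(M, -0.5)     # definition (30)
```

The relation is a polar decomposition of $\tilde C^{1/2}Q^{1/2}$, which is why $\Theta$ is unitary.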



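Consider now the eigenvalue/eigenvector decompositions of matrices $\frac{C}{\sqrt{K+1}}$ and $\frac{Q^{1/2}\tilde CQ^{1/2}}{\sqrt{K+1}}$:

$$\frac{C}{\sqrt{K+1}} = UDU^H \quad\text{and}\quad \frac{Q^{1/2}\tilde CQ^{1/2}}{\sqrt{K+1}} = \tilde U\tilde D\tilde U^H. \qquad (31)$$

Matrices $U$ and $\tilde U$ are the eigenvector matrices while $D$ and $\tilde D$ are the diagonal eigenvalue
matrices. It is then clear that the ergodic mutual information of channel $HQ^{1/2}$ coincides with
the EMI of $\Sigma = U^HHQ^{1/2}\tilde U$. Matrix $\Sigma$ can be written as $\Sigma = B + Y$ where

$$B = \sqrt{\frac{K}{K+1}}\,U^HAQ^{1/2}\tilde U \quad\text{and}\quad Y = \frac{1}{\sqrt{t}}D^{1/2}X\tilde D^{1/2} \quad\text{with}\quad X = U^HW\Theta\tilde U. \qquad (32)$$

As matrix $W$ has i.i.d. $\mathcal{CN}(0,1)$ entries, so has matrix $X = U^HW\Theta\tilde U$ due to the unitary
invariance. Note that the entries of $Y$ are independent since $D$ and $\tilde D$ are diagonal. We sum
up the previous discussion in the following proposition.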

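Proposition 1: Let $W$ be a $r \times t$ matrix whose individual entries are i.i.d. $\mathcal{CN}(0,1)$ random
variables. The two ergodic mutual informations

$$I(Q) = E\log\det\left(I + \frac{HQH^H}{\sigma^2}\right) \quad\text{and}\quad J(\sigma^2) = E\log\det\left(I + \frac{\Sigma\Sigma^H}{\sigma^2}\right)$$

are equal provided that channel $H$ is given by:

$$H = \sqrt{\frac{K}{K+1}}\,A + \frac{1}{\sqrt{K+1}}\,V$$

with $V = \frac{1}{\sqrt{t}}C^{1/2}W\tilde C^{1/2}$; channel $\Sigma$ by $\Sigma = B + Y$ with $Y = \frac{1}{\sqrt{t}}D^{1/2}X\tilde D^{1/2}$; and that (30), (31)
and (32) hold true.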

C. Study of the EMI $I(Q)$.

We now apply the previous results to the study of the EMI of channel H. We first state the 
corresponding result. 

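Theorem 3: For $Q \in \mathcal{C}_1$, consider the system of equations

$$\delta = f(\delta, \tilde\delta, Q), \qquad \tilde\delta = \tilde f(\delta, \tilde\delta, Q), \qquad (33)$$

where $f(\delta, \tilde\delta, Q)$ and $\tilde f(\delta, \tilde\delta, Q)$ are given by:

$$f(\delta, \tilde\delta, Q) = \frac{1}{t}\mathrm{Tr}\left\{C\left[\sigma^2\left(I_r + \frac{\tilde\delta}{K+1}C\right) + \frac{K}{K+1}AQ^{1/2}\left(I_t + \frac{\delta}{K+1}Q^{1/2}\tilde CQ^{1/2}\right)^{-1}Q^{1/2}A^H\right]^{-1}\right\}, \qquad (34)$$

$$\tilde f(\delta, \tilde\delta, Q) = \frac{1}{t}\mathrm{Tr}\left\{Q^{1/2}\tilde CQ^{1/2}\left[\sigma^2\left(I_t + \frac{\delta}{K+1}Q^{1/2}\tilde CQ^{1/2}\right) + \frac{K}{K+1}Q^{1/2}A^H\left(I_r + \frac{\tilde\delta}{K+1}C\right)^{-1}AQ^{1/2}\right]^{-1}\right\}. \qquad (35)$$

Then the system of equations (33) has a unique strictly positive solution $(\delta(Q), \tilde\delta(Q))$.
Furthermore, assume that $\sup_t \|Q\| < +\infty$, $\sup_t \|A\| < +\infty$, $\sup_t \|C\| < +\infty$, and
$\sup_t \|\tilde C\| < +\infty$. Assume also that $\inf_t \lambda_{\min}(\tilde C) > 0$ where $\lambda_{\min}(\tilde C)$ represents the smallest
eigenvalue of $\tilde C$. Then, as $t \to +\infty$,

$$I(Q) = \bar I(Q) + O\left(\frac{1}{t}\right), \qquad (36)$$

where the asymptotic approximation $\bar I(Q)$ is given by

$$\bar I(Q) = \log\det\left(I_t + \frac{\delta(Q)}{K+1}Q^{1/2}\tilde CQ^{1/2} + \frac{1}{\sigma^2}\frac{K}{K+1}Q^{1/2}A^H\left(I_r + \frac{\tilde\delta(Q)}{K+1}C\right)^{-1}AQ^{1/2}\right) + \log\det\left(I_r + \frac{\tilde\delta(Q)}{K+1}C\right) - \frac{t\sigma^2}{K+1}\,\delta(Q)\,\tilde\delta(Q), \qquad (37)$$

or equivalently by

$$\bar I(Q) = \log\det\left(I_r + \frac{\tilde\delta(Q)}{K+1}C + \frac{1}{\sigma^2}\frac{K}{K+1}AQ^{1/2}\left(I_t + \frac{\delta(Q)}{K+1}Q^{1/2}\tilde CQ^{1/2}\right)^{-1}Q^{1/2}A^H\right) + \log\det\left(I_t + \frac{\delta(Q)}{K+1}Q^{1/2}\tilde CQ^{1/2}\right) - \frac{t\sigma^2}{K+1}\,\delta(Q)\,\tilde\delta(Q). \qquad (38)$$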
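Proof: We rely on the virtual channel introduced in Section III-B and on the
eigenvalue/eigenvector decomposition performed there.

Matrices $B$, $D$, $\tilde D$ as introduced in Proposition 1 are clearly uniformly bounded. Moreover,
$\inf_t \frac{1}{t}\mathrm{Tr}\,D > 0$ because $\frac{1}{r}\mathrm{Tr}\,C = 1$ due to the model specifications (recall that $D$ collects the
eigenvalues of $C/\sqrt{K+1}$ and that $t/r \to c$), and $\inf_t \frac{1}{t}\mathrm{Tr}\,Q^{1/2}\tilde CQ^{1/2} \ge
\inf_t \lambda_{\min}(\tilde C)\,\frac{1}{t}\mathrm{Tr}\,Q > 0$ as $\frac{1}{t}\mathrm{Tr}\,Q = 1$. Therefore, matrices $B$, $D$ and $\tilde D$ clearly satisfy the
assumptions of Theorems 1 and 2.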

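We first apply the results of Theorem 1 to matrix $\Sigma$, and use the same notations as in the
statement of Theorem 1. Using the unitary invariance of the trace of a matrix, it is straightforward
to check that:

$$\frac{f(\delta, \tilde\delta, Q)}{\sqrt{K+1}} = \frac{1}{t}\mathrm{Tr}\left[D\left(\sigma^2\left(I + D\frac{\tilde\delta}{\sqrt{K+1}}\right) + B\left(I + \tilde D\frac{\delta}{\sqrt{K+1}}\right)^{-1}B^H\right)^{-1}\right],$$

$$\frac{\tilde f(\delta, \tilde\delta, Q)}{\sqrt{K+1}} = \frac{1}{t}\mathrm{Tr}\left[\tilde D\left(\sigma^2\left(I + \tilde D\frac{\delta}{\sqrt{K+1}}\right) + B^H\left(I + D\frac{\tilde\delta}{\sqrt{K+1}}\right)^{-1}B\right)^{-1}\right].$$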
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Therefore, {S,S) is solution of (33) if and only if (-y==, —^==) is solution of (11). As the 

system (11) admits a unique solution, say (/3,/3), the solution (6,6) to (33) exists, is unique 

and is related to (/3, /?) by the relations : 

S S 
f3= ^— — , P= r—— . (39) 

In order to justify (37) and (38), we note that $J(\sigma^2)$ coincides with the EMI $I(Q)$. Moreover,
the unitary invariance of the determinant of a matrix together with (39) imply that $\bar I(Q)$ defined
by (37) and (38) coincides with the approximation $\bar J$ given by (23) and (24). This proves (36)
as well. ■
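
As a numerical illustration of Theorem 3, the following sketch solves system (33) by plain successive substitution and evaluates both expressions (37) and (38) of the approximation. It is only a sketch under stated assumptions: the function names (`f_maps`, `solve_system_33`, `i_bar_37`, `i_bar_38`), the random matrix standing in for the line-of-sight component $A$, the exponential correlation matrices and the values $K = 1$, $\sigma^2 = 1$ are all illustrative choices, and the successive-substitution iteration is one simple way to reach the fixed point, not necessarily the scheme used by the authors.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def pieces(A, Ct, Q):
    """Return Q^{1/2} C~ Q^{1/2} and A Q^{1/2}."""
    Qh = psd_sqrt(Q)
    return Qh @ Ct @ Qh, A @ Qh

def f_maps(A, C, Ct, Q, K, sigma2, d, dt):
    """Right-hand sides f and f~ of (34)-(35) evaluated at (d, dt)."""
    r, t = A.shape
    QCQ, AQ = pieces(A, Ct, Q)
    X = np.eye(t) + d / (K + 1) * QCQ
    Y = np.eye(r) + dt / (K + 1) * C
    c = K / (K + 1)
    T = np.linalg.inv(sigma2 * Y + c * AQ @ np.linalg.solve(X, AQ.conj().T))
    Tt = np.linalg.inv(sigma2 * X + c * AQ.conj().T @ np.linalg.solve(Y, AQ))
    return np.trace(C @ T).real / t, np.trace(QCQ @ Tt).real / t

def solve_system_33(A, C, Ct, Q, K, sigma2, n_iter=1000, tol=1e-13):
    """Successive substitution on (33) (an assumed, simple solver)."""
    d, dt = 1.0, 1.0
    for _ in range(n_iter):
        d_new, dt_new = f_maps(A, C, Ct, Q, K, sigma2, d, dt)
        step = abs(d_new - d) + abs(dt_new - dt)
        d, dt = d_new, dt_new
        if step < tol:
            break
    return d, dt

def i_bar_37(A, C, Ct, Q, K, sigma2, d, dt):
    """Approximation written as in (37)."""
    r, t = A.shape
    QCQ, AQ = pieces(A, Ct, Q)
    Y = np.eye(r) + dt / (K + 1) * C
    M = (np.eye(t) + d / (K + 1) * QCQ
         + K / ((K + 1) * sigma2) * AQ.conj().T @ np.linalg.solve(Y, AQ))
    return (np.linalg.slogdet(M)[1] + np.linalg.slogdet(Y)[1]
            - t * sigma2 * d * dt / (K + 1))

def i_bar_38(A, C, Ct, Q, K, sigma2, d, dt):
    """Approximation written as in (38)."""
    r, t = A.shape
    QCQ, AQ = pieces(A, Ct, Q)
    X = np.eye(t) + d / (K + 1) * QCQ
    M = (np.eye(r) + dt / (K + 1) * C
         + K / ((K + 1) * sigma2) * AQ @ np.linalg.solve(X, AQ.conj().T))
    return (np.linalg.slogdet(M)[1] + np.linalg.slogdet(X)[1]
            - t * sigma2 * d * dt / (K + 1))

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho^|i-j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# Illustrative setting (all values are assumptions, not taken from the paper).
rng = np.random.default_rng(0)
r = t = 4
K, sigma2 = 1.0, 1.0
A = rng.standard_normal((r, t))          # stand-in for the LOS component
C, Ct, Q = exp_corr(r, 0.3), exp_corr(t, 0.8), np.eye(t)

delta, delta_t = solve_system_33(A, C, Ct, Q, K, sigma2)
residual = f_maps(A, C, Ct, Q, K, sigma2, delta, delta_t)
i37 = i_bar_37(A, C, Ct, Q, K, sigma2, delta, delta_t)
i38 = i_bar_38(A, C, Ct, Q, K, sigma2, delta, delta_t)
print(delta, delta_t, i37, i38)
```

The agreement of `i37` and `i38` reflects a determinant (Schur complement) identity, so it holds for any positive pair $(\delta, \tilde\delta)$ and not only at the solution of (33).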
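In the following, we denote by $T_K(\sigma^2)$ and $\tilde T_K(\sigma^2)$ the following matrix-valued functions:

$$T_K(\sigma^2) = \left[ \sigma^2\left(I + \frac{\tilde\delta}{K+1} C\right) + \frac{K}{K+1}\, A Q^{1/2}\left(I + \frac{\delta}{K+1} Q^{1/2}\tilde C Q^{1/2}\right)^{-1} Q^{1/2} A^H \right]^{-1}$$
$$\tilde T_K(\sigma^2) = \left[ \sigma^2\left(I + \frac{\delta}{K+1} Q^{1/2}\tilde C Q^{1/2}\right) + \frac{K}{K+1}\, Q^{1/2} A^H\left(I + \frac{\tilde\delta}{K+1} C\right)^{-1} A Q^{1/2} \right]^{-1} \tag{40}$$

They are related to matrices $T$ and $\tilde T$ defined by (12) by the relations:

$$T_K(\sigma^2) = U\, T(\sigma^2)\, U^H, \qquad \tilde T_K(\sigma^2) = \tilde U\, \tilde T(\sigma^2)\, \tilde U^H \tag{41}$$

and their entries represent deterministic approximations of $(HQH^H + \sigma^2 I_r)^{-1}$ and
$(Q^{1/2}H^H H Q^{1/2} + \sigma^2 I_t)^{-1}$ (in the sense of Theorem 1).

As $\frac{1}{r}\mathrm{Tr}\,T_K = \frac{1}{r}\mathrm{Tr}\,T$ and $\frac{1}{t}\mathrm{Tr}\,\tilde T_K = \frac{1}{t}\mathrm{Tr}\,\tilde T$, the quantities $\frac{1}{r}\mathrm{Tr}\,T_K$ and $\frac{1}{t}\mathrm{Tr}\,\tilde T_K$ are the
Stieltjes transforms of the probability measures $\mu$ and $\tilde\mu$ introduced in Theorem 1. As matrices
$HQH^H$ and $\Sigma\Sigma^H$ (resp. $Q^{1/2}H^HHQ^{1/2}$ and $\Sigma^H\Sigma$) have the same eigenvalues, (18) implies
that the eigenvalue distribution of $HQH^H$ (resp. $Q^{1/2}H^HHQ^{1/2}$) behaves like $\mu$ (resp. $\tilde\mu$).

We finally mention that $\delta(\sigma^2)$ and $\tilde\delta(\sigma^2)$ are given by

$$\delta(\sigma^2) = \frac{1}{t}\mathrm{Tr}\, C\, T_K(\sigma^2) \quad\text{and}\quad \tilde\delta(\sigma^2) = \frac{1}{t}\mathrm{Tr}\, Q^{1/2}\tilde C Q^{1/2}\, \tilde T_K(\sigma^2) \tag{42}$$

and that the following representations hold true:

$$\delta(\sigma^2) = \int_{\mathbb{R}^+} \frac{\mu_d(d\lambda)}{\lambda + \sigma^2}, \qquad \tilde\delta(\sigma^2) = \int_{\mathbb{R}^+} \frac{\tilde\mu_d(d\lambda)}{\lambda + \sigma^2} \tag{43}$$

where $\mu_d$ and $\tilde\mu_d$ are positive measures on $\mathbb{R}^+$ satisfying $\mu_d(\mathbb{R}^+) = \frac{1}{t}\mathrm{Tr}\,C$ and $\tilde\mu_d(\mathbb{R}^+) = \frac{1}{t}\mathrm{Tr}\,Q^{1/2}\tilde C Q^{1/2}$.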
IV. Strict concavity of $\bar I(Q)$ and approximation of the capacity $I(Q_*)$

A. Strict concavity of $\bar I(Q)$

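The strict concavity of $\bar I(Q)$ is an important issue for optimization purposes (see Section V).
The main result of the section is the following:

Theorem 4: The function $Q \mapsto \bar I(Q)$ is strictly concave on $\mathcal{C}_1$.

As we shall see, the concavity of $\bar I$ can be established quite easily by relying on the concavity
of the EMI $I(Q) = E \log\det\left(I + \frac{HQH^H}{\sigma^2}\right)$. The strict concavity is more demanding and its
proof is mainly postponed to Appendix III.

Recall that we denote by $\mathcal{C}_1$ the set of nonnegative Hermitian $t \times t$ matrices whose normalized
trace is equal to one (i.e. $\frac{1}{t}\mathrm{Tr}\,Q = 1$). In the sequel, we shall rely on the following
straightforward but useful result:

Proposition 2: Let $f : \mathcal{C}_1 \to \mathbb{R}$ be a real function. Then $f$ is strictly concave if and only if
for every pair of matrices $Q_1, Q_2$ ($Q_1 \neq Q_2$) of $\mathcal{C}_1$, the function $\phi(\lambda)$ defined on $[0, 1]$ by

$$\phi(\lambda) = f(\lambda Q_1 + (1-\lambda) Q_2)$$

is strictly concave.

1) Concavity of the EMI: We first recall that $I(Q) = E \log\det\left(I + \frac{HQH^H}{\sigma^2}\right)$ is concave on
$\mathcal{C}_1$, and provide a proof for the sake of completeness. Denote by $Q = \lambda Q_1 + (1-\lambda) Q_2$ and
let $\phi(\lambda) = I(\lambda Q_1 + (1-\lambda) Q_2)$. Following Proposition 2, it is sufficient to prove that $\phi$ is
concave. As $\log\det\left(I + \frac{HQH^H}{\sigma^2}\right) = \log\det\left(I + \frac{H^HHQ}{\sigma^2}\right)$, we have:

$$\phi(\lambda) = E \log\det\left(I + \frac{H^HHQ}{\sigma^2}\right),$$

$$\phi'(\lambda) = E\,\mathrm{Tr}\left[\left(I + \frac{H^HHQ}{\sigma^2}\right)^{-1} \frac{H^HH}{\sigma^2}\,(Q_1 - Q_2)\right],$$

$$\phi''(\lambda) = -E\,\mathrm{Tr}\left[\left(I + \frac{H^HHQ}{\sigma^2}\right)^{-1} \frac{H^HH}{\sigma^2}\,(Q_1 - Q_2)\left(I + \frac{H^HHQ}{\sigma^2}\right)^{-1} \frac{H^HH}{\sigma^2}\,(Q_1 - Q_2)\right].$$

In order to conclude that $\phi''(\lambda) \le 0$, we notice that $\left(I + \frac{H^HHQ}{\sigma^2}\right)^{-1}\frac{H^HH}{\sigma^2}$ coincides with

$$H^H\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2}$$

(use the well-known identity $(I + UV)^{-1}U = U(I + VU)^{-1}$ for $U = H^H$ and $V = \frac{HQ}{\sigma^2}$).
We denote by $M$ the nonnegative matrix

$$M = H^H\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2}$$

and remark that

$$\phi''(\lambda) = -E\,\mathrm{Tr}\left[M(Q_1 - Q_2)M(Q_1 - Q_2)\right] \tag{44}$$

or equivalently that

$$\phi''(\lambda) = -E\,\mathrm{Tr}\left[M^{1/2}(Q_1 - Q_2)M^{1/2}\,M^{1/2}(Q_1 - Q_2)M^{1/2}\right].$$

As matrix $M^{1/2}(Q_1 - Q_2)M^{1/2}$ is Hermitian, this of course implies that $\phi''(\lambda) \le 0$. The
concavity of $\phi$ and of $I$ are established.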
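
The two algebraic facts used in this argument can be checked numerically for a single channel realization. The sketch below is illustrative only: the dimensions, the noise variance, the random draw of $H$ and the helper `random_cov` are assumptions made for the example. It verifies the push-through identity and the sign of the integrand of (44) before the expectation is taken.

```python
import numpy as np

rng = np.random.default_rng(1)
r, t, sigma2 = 3, 4, 0.5          # illustrative sizes and noise variance

# One realization of a complex channel matrix (assumed i.i.d. Gaussian here).
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

def random_cov(n):
    """Random nonnegative Hermitian matrix with normalized trace (1/n) Tr Q = 1."""
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Q = G @ G.conj().T
    return n * Q / np.trace(Q).real

Q1, Q2 = random_cov(t), random_cov(t)
lam = 0.3
Q = lam * Q1 + (1 - lam) * Q2
HhH = H.conj().T @ H

# Left-hand side: (I + H^H H Q / sigma^2)^{-1} H^H H / sigma^2
lhs = np.linalg.solve(np.eye(t) + HhH @ Q / sigma2, HhH / sigma2)
# Matrix M: H^H (I + H Q H^H / sigma^2)^{-1} H / sigma^2
M = H.conj().T @ np.linalg.solve(np.eye(r) + H @ Q @ H.conj().T / sigma2, H) / sigma2

identity_gap = np.max(np.abs(lhs - M))
D = Q1 - Q2
second_derivative_term = -np.trace(M @ D @ M @ D).real   # integrand of (44)
print(identity_gap, second_derivative_term)
```

Since the integrand is nonpositive for every realization of $H$, its expectation is nonpositive as well, which is the content of (44).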






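2) Using an auxiliary channel to establish concavity of $\bar I(Q)$: Denote by $\otimes$ the Kronecker
product of matrices. We introduce the following matrices:

$$\Delta = I_m \otimes C, \quad \tilde\Delta = I_m \otimes \tilde C, \quad \check A = I_m \otimes A, \quad \check Q = I_m \otimes Q.$$

Matrix $\Delta$ is of size $rm \times rm$, matrices $\tilde\Delta$ and $\check Q$ are of size $tm \times tm$, and $\check A$ is of size $rm \times tm$.
Let us now introduce:

$$\check V = \frac{1}{\sqrt{mt}}\, \Delta^{1/2}\, \check W\, \tilde\Delta^{1/2} \quad\text{and}\quad \check H = \sqrt{\frac{K}{K+1}}\, \check A + \frac{1}{\sqrt{K+1}}\, \check V,$$

where $\check W$ is a $rm \times tm$ matrix whose entries are i.i.d. $\mathcal{CN}(0,1)$-distributed random variables.
Denote by $I_m(\check Q)$ the EMI associated with channel $\check H$:

$$I_m(\check Q) = E \log\det\left(I + \frac{\check H \check Q \check H^H}{\sigma^2}\right).$$

Applying Theorem 3 to the channel $\check H$, we conclude that $I_m(\check Q)$ admits an asymptotic
approximation $\bar I_m(\check Q)$ defined by the system (34)-(35) and formula (37), where one will
substitute the quantities related to channel $H$ by those related to channel $\check H$, i.e.:

$$t \leftrightarrow mt, \quad r \leftrightarrow mr, \quad A \leftrightarrow \check A, \quad Q \leftrightarrow \check Q, \quad C \leftrightarrow \Delta, \quad \tilde C \leftrightarrow \tilde\Delta.$$

Due to the block-diagonal nature of matrices $\check A$, $\check Q$, $\Delta$ and $\tilde\Delta$, the system associated with
channel $\check H$ is exactly the same as the one associated with channel $H$. Moreover, a straightforward
computation yields:

$$\frac{1}{m}\, \bar I_m(\check Q) = \bar I(Q), \quad \forall m \ge 1.$$

It remains to apply the convergence result (36) to conclude that

$$\lim_{m \to \infty} \frac{1}{m}\, I_m(\check Q) = \bar I(Q).$$

Since $Q \mapsto I_m(\check Q) = I_m(I_m \otimes Q)$ is concave, $\bar I$ is concave as a pointwise limit of concave
functions.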
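
The "straightforward computation" relies on two elementary consequences of the block-diagonal (Kronecker) structure: log-determinants are multiplied by $m$, while traces normalized by $mt$ are unchanged. The following sketch checks both on small random matrices; the matrices `Q` and `G` and the sizes are arbitrary stand-ins chosen for illustration, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
t, m = 3, 4                        # illustrative sizes

def random_psd(n):
    """Random real positive semidefinite matrix (illustrative stand-in)."""
    G = rng.standard_normal((n, n))
    return G @ G.T

Q, G = random_psd(t), random_psd(t)

# Block-diagonal replicas I_m (x) Q and I_m (x) G.
Q_check = np.kron(np.eye(m), Q)
G_check = np.kron(np.eye(m), G)

logdet_small = np.linalg.slogdet(np.eye(t) + Q @ G)[1]
logdet_big = np.linalg.slogdet(np.eye(m * t) + Q_check @ G_check)[1]

trace_small = np.trace(Q @ G) / t
trace_big = np.trace(Q_check @ G_check) / (m * t)

print(logdet_big / m, logdet_small)   # log-determinants scale by m
print(trace_big, trace_small)         # normalized traces coincide
```

The invariance of normalized traces is what makes the system (34)-(35) identical for $H$ and $\check H$, and the scaling of log-determinants gives the factor $1/m$.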

3) Uniform strict concavity of the EMI of the auxiliary channel - Strict concavity of $\bar I(Q)$:
In order to establish the strict concavity of $\bar I(Q)$, we shall rely on the following lemma:

Lemma 1: Let $\bar\phi : [0, 1] \to \mathbb{R}$ be a real function such that there exists a family $(\phi_m)_{m \ge 1}$ of
real functions satisfying:

(i) The functions $\phi_m$ are twice differentiable and there exists $\kappa < 0$ such that

$$\forall m \ge 1, \ \forall \lambda \in [0, 1], \quad \phi_m''(\lambda) \le \kappa < 0. \tag{45}$$

(ii) For every $\lambda \in [0, 1]$, $\phi_m(\lambda) \to \bar\phi(\lambda)$ as $m \to \infty$.

Then $\bar\phi$ is a strictly concave real function.

Proof of Lemma 1 is postponed to Appendix III.

Let $Q_1, Q_2$ in $\mathcal{C}_1$; denote by $Q = \lambda Q_1 + (1 - \lambda) Q_2$, $\check Q_1 = I_m \otimes Q_1$, $\check Q_2 = I_m \otimes Q_2$,
$\check Q = I_m \otimes Q$. Let $\check H$ be the matrix associated with the auxiliary channel and denote by:

$$\phi_m(\lambda) = \frac{1}{m}\, E \log\det\left(I + \frac{\check H \check Q \check H^H}{\sigma^2}\right).$$

We have already proved that $\phi_m(\lambda) \to \bar\phi(\lambda) = \bar I(\lambda Q_1 + (1 - \lambda) Q_2)$ as $m \to \infty$. In order to fulfill
the assumptions of Lemma 1, it is sufficient to prove that there exists $\kappa < 0$ such that for every
$\lambda \in [0, 1]$,

$$\limsup_{m \to \infty} \phi_m''(\lambda) \le \kappa < 0. \tag{46}$$

(46) is proved in Appendix III.

B. Approximation of the capacity $I(Q_*)$

Since $\bar I$ is strictly concave over the compact set $\mathcal{C}_1$, it admits a unique argmax we shall denote
by $\bar Q_*$, i.e.:

$$\bar I(\bar Q_*) = \max_{Q \in \mathcal{C}_1} \bar I(Q).$$

As we shall see in Section V, matrix $\bar Q_*$ can be obtained by a rather simple algorithm. Provided
that $\sup_t \|\bar Q_*\|$ is bounded, Eq. (36) in Theorem 3 yields $I(\bar Q_*) - \bar I(\bar Q_*) \to 0$ as $t \to \infty$. It
remains to check that $I(Q_*) - I(\bar Q_*)$ goes asymptotically to zero to be able to approximate
the capacity. This is the purpose of the next proposition.

Proposition 3: Assume that $\sup_t \|A\| < \infty$, $\sup_t \|C\| < \infty$, $\sup_t \|\tilde C\| < \infty$, $\inf_t \lambda_{\min}(C) > 0$,
and $\inf_t \lambda_{\min}(\tilde C) > 0$. Let $\bar Q_*$ and $Q_*$ be the maximizers over $\mathcal{C}_1$ of $\bar I$ and $I$ respectively.
Then the following facts hold true:

(i) $\sup_t \|\bar Q_*\| < \infty$.

(ii) $\sup_t \|Q_*\| < \infty$.

(iii) $I(\bar Q_*) = I(Q_*) + O(t^{-1})$.

Proof: The proof of items (i) and (ii) is postponed to Appendix VI. Let us prove (iii). We have

$$\underbrace{\left(I(Q_*) - I(\bar Q_*)\right)}_{\ge 0} + \underbrace{\left(\bar I(\bar Q_*) - \bar I(Q_*)\right)}_{\ge 0} = \underbrace{\left(I(Q_*) - \bar I(Q_*)\right)}_{= O(t^{-1}) \text{ by (ii) and Th. 3 Eq. (36)}} + \underbrace{\left(\bar I(\bar Q_*) - I(\bar Q_*)\right)}_{= O(t^{-1}) \text{ by (i) and Th. 3 Eq. (36)}} \tag{47}$$

where the two terms of the lefthand side are nonnegative due to the fact that $Q_*$ and $\bar Q_*$ are the
maximizers of $I$ and $\bar I$ respectively. As a direct consequence of (47), we have $I(Q_*) - I(\bar Q_*) =
O(t^{-1})$ and the proof is completed. ■

V. Optimization of the input covariance matrix 

In the previous section, we have proved that matrix $\bar Q_*$ asymptotically achieves the capacity.
The purpose of this section is to propose an efficient way of maximizing the asymptotic
approximation $\bar I(Q)$ without using complicated numerical optimization algorithms. In fact, we
will show that our problem boils down to simple waterfilling algorithms.

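A. Properties of the maximum of $\bar I(Q)$

In this section, we shall establish some of $\bar Q_*$'s properties. We first introduce a few notations.
Let $V(\kappa, \tilde\kappa, Q)$ be the function defined by:

$$V(\kappa, \tilde\kappa, Q) = \log\det\left[ I_t + \frac{\kappa}{K+1} Q^{1/2}\tilde C Q^{1/2} + \frac{K}{\sigma^2(K+1)}\, Q^{1/2} A^H \left(I_r + \frac{\tilde\kappa}{K+1} C\right)^{-1} A Q^{1/2} \right] + \log\det\left(I_r + \frac{\tilde\kappa}{K+1} C\right) - \frac{t\sigma^2 \kappa\tilde\kappa}{K+1} \tag{48}$$

or equivalently by

$$V(\kappa, \tilde\kappa, Q) = \log\det\left[ I_r + \frac{\tilde\kappa}{K+1} C + \frac{K}{\sigma^2(K+1)}\, A Q^{1/2} \left(I_t + \frac{\kappa}{K+1} Q^{1/2}\tilde C Q^{1/2}\right)^{-1} Q^{1/2} A^H \right] + \log\det\left(I_t + \frac{\kappa}{K+1} Q^{1/2}\tilde C Q^{1/2}\right) - \frac{t\sigma^2 \kappa\tilde\kappa}{K+1}. \tag{49}$$

Note that if $(\delta(Q), \tilde\delta(Q))$ is the solution of system (33), then:

$$\bar I(Q) = V(\delta(Q), \tilde\delta(Q), Q).$$

Denote by $(\delta_*, \tilde\delta_*)$ the solution $(\delta(\bar Q_*), \tilde\delta(\bar Q_*))$ of (33) associated with $\bar Q_*$. The aim of the
section is to prove that $\bar Q_*$ is the solution of the following standard waterfilling problem:

$$\bar I(\bar Q_*) = \max_{Q \in \mathcal{C}_1} V(\delta_*, \tilde\delta_*, Q).$$

Denote by $G(\kappa, \tilde\kappa)$ the $t \times t$ matrix given by:

$$G(\kappa, \tilde\kappa) = \frac{\kappa}{K+1}\, \tilde C + \frac{1}{\sigma^2}\frac{K}{K+1}\, A^H \left(I_r + \frac{\tilde\kappa}{K+1} C\right)^{-1} A. \tag{50}$$

Then, $V(\kappa, \tilde\kappa, Q)$ also writes

$$V(\kappa, \tilde\kappa, Q) = \log\det\left(I + Q\, G(\kappa, \tilde\kappa)\right) + \log\det\left(I_r + \frac{\tilde\kappa}{K+1} C\right) - \frac{t\sigma^2 \kappa\tilde\kappa}{K+1}, \tag{51}$$

which readily implies the differentiability of $(\kappa, \tilde\kappa, Q) \mapsto V(\kappa, \tilde\kappa, Q)$ and the strict concavity
of $Q \mapsto V(\kappa, \tilde\kappa, Q)$ ($\kappa$ and $\tilde\kappa$ being frozen).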

In the sequel, we will denote by $\nabla F(x)$ the derivative of the differentiable function $F$ at
point $x$ ($x$ taking its values in some finite-dimensional space) and by $\langle \nabla F(x), y \rangle$ the value of
this derivative at point $y$. Sometimes, a function is not differentiable but still admits directional
derivatives: The directional derivative of a function $F$ at $x$ in direction $y$ is

$$F'(x; y) = \lim_{t \downarrow 0} \frac{F(x + ty) - F(x)}{t}$$

when the limit exists. Of course, if $F$ is differentiable at $x$, then $F'(x; y) = \langle \nabla F(x), y \rangle$. The
following proposition captures the main features needed in the sequel.
Proposition 4: Let $F : \mathcal{C}_1 \to \mathbb{R}$ be a concave function. Then:

(i) The directional derivative $F'(Q; P - Q)$ exists in $(-\infty, \infty]$ for all $Q, P$ in $\mathcal{C}_1$.

(ii) (necessary condition) If $F$ attains its maximum for $Q_* \in \mathcal{C}_1$, then:

$$\forall Q \in \mathcal{C}_1, \quad F'(Q_*; Q - Q_*) \le 0. \tag{52}$$

(iii) (sufficient condition) Assume that there exists $Q_* \in \mathcal{C}_1$ such that:

$$\forall Q \in \mathcal{C}_1, \quad F'(Q_*; Q - Q_*) \le 0. \tag{53}$$

Then $F$ admits its maximum at $Q_*$ (i.e. $Q_*$ is an argmax of $F$ over $\mathcal{C}_1$).
If $F$ is differentiable then both conditions (52) and (53) write:

$$\forall Q \in \mathcal{C}_1, \quad \langle \nabla F(Q_*), Q - Q_* \rangle \le 0.$$

Although this is standard material (see for instance [4, Chapter 2]), we provide some elements
of proof for the reader's convenience.

Proof: Let us first prove item (i). As $Q + t(P - Q) = (1 - t)Q + tP \in \mathcal{C}_1$, $\Delta(t) =
t^{-1}\left(F(Q + t(P - Q)) - F(Q)\right)$ is well-defined. Let $0 < s \le t \le 1$ and consider

$$\Delta(t) - \Delta(s) = \frac{1}{s}\left\{ \frac{s}{t} F((1-t)Q + tP) + \frac{t-s}{t} F(Q) - F((1-s)Q + sP) \right\}$$
$$\overset{(a)}{\le} \frac{1}{s}\left\{ F\left( \frac{s}{t}\left((1-t)Q + tP\right) + \frac{t-s}{t} Q \right) - F((1-s)Q + sP) \right\}$$
$$= \frac{1}{s}\left\{ F((1-s)Q + sP) - F((1-s)Q + sP) \right\} = 0,$$

where (a) follows from the concavity of $F$. This shows that $\Delta(t)$ increases as $t \downarrow 0$, and in
particular always admits a limit in $(-\infty, \infty]$.

Item (ii) readily follows from the fact that $F((1 - t)Q_* + tP) \le F(Q_*)$ due to the mere
definition of $Q_*$. This implies that $\Delta(t) \le 0$ which in turn yields (52).

We now prove (iii). The concavity of $F$ yields, for $Q = Q_*$:

$$\Delta(t) = \frac{F(Q_* + t(P - Q_*)) - F(Q_*)}{t} \ge F(P) - F(Q_*).$$

As $\lim_{t \downarrow 0} \Delta(t) \le 0$ by (53), one gets: $\forall P \in \mathcal{C}_1$, $F(P) - F(Q_*) \le 0$. Otherwise stated, $F$
attains its maximum at $Q_*$ and Proposition 4 is proved. ■

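In the following proposition, we gather various properties related to $\bar I$.

Proposition 5: Consider the functions $\delta(Q)$, $\tilde\delta(Q)$ and $\bar I(Q)$ from $\mathcal{C}_1$ to $\mathbb{R}$. The following
properties hold true:

(i) Functions $\delta(Q)$, $\tilde\delta(Q)$ and $\bar I(Q)$ are differentiable (and in particular continuous) over $\mathcal{C}_1$.

(ii) Recall that $\bar Q_*$ is the argmax of $\bar I$ over $\mathcal{C}_1$, i.e. $\forall Q \in \mathcal{C}_1$, $\bar I(Q) \le \bar I(\bar Q_*)$. Let $Q \in \mathcal{C}_1$.
The following property:

$$\forall P \in \mathcal{C}_1, \quad \langle \nabla \bar I(Q), P - Q \rangle \le 0$$

holds true if and only if $Q = \bar Q_*$.

(iii) Denote by $\delta_*$ and $\tilde\delta_*$ the quantities $\delta(\bar Q_*)$ and $\tilde\delta(\bar Q_*)$. Matrix $\bar Q_*$ is the solution of
the standard waterfilling problem: Maximize over $Q \in \mathcal{C}_1$ the function $V(\delta_*, \tilde\delta_*, Q)$ or
equivalently the function $\log\det(I + Q\,G(\delta_*, \tilde\delta_*))$.

Proof: (i) is established in the Appendix. Let us establish (ii). Recall that $\bar I(Q)$ is strictly
concave by Theorem 4 (and therefore its maximum is attained at at most one point). On the
other hand, $\bar I(Q)$ is continuous by (i) over $\mathcal{C}_1$ which is compact. Therefore, the maximum of
$\bar I(Q)$ is uniquely attained at a point $\bar Q_*$. Item (ii) follows then from Proposition 4.
Proof of item (iii) is based on the following identity, to be proved below:

$$\langle \nabla \bar I(\bar Q_*), Q - \bar Q_* \rangle = \langle \nabla_Q V(\delta_*, \tilde\delta_*, \bar Q_*), Q - \bar Q_* \rangle, \tag{54}$$

where $\nabla_Q$ denotes the derivative of $V(\kappa, \tilde\kappa, Q)$ with respect to $V$'s third component, i.e.
$\nabla_Q V(\kappa, \tilde\kappa, Q) = \nabla \Gamma(Q)$ with $\Gamma : Q \mapsto V(\kappa, \tilde\kappa, Q)$. Assume that (54) holds true. Then item
(ii) implies that $\langle \nabla_Q V(\delta_*, \tilde\delta_*, \bar Q_*), Q - \bar Q_* \rangle \le 0$ for every $Q \in \mathcal{C}_1$. As $Q \mapsto V(\delta_*, \tilde\delta_*, Q)$
is strictly concave on $\mathcal{C}_1$, $\bar Q_*$ is the argmax of $V(\delta_*, \tilde\delta_*, \cdot)$ by Proposition 4 and we are done.
It remains to prove (54). Consider $Q$ and $P$ in $\mathcal{C}_1$, and use the identity

$$\langle \nabla \bar I(P), Q - P \rangle = \langle \nabla_Q V(\delta(P), \tilde\delta(P), P), Q - P \rangle + \left(\frac{\partial V}{\partial \kappa}\right)(\delta(P), \tilde\delta(P), P)\, \langle \nabla \delta(P), Q - P \rangle + \left(\frac{\partial V}{\partial \tilde\kappa}\right)(\delta(P), \tilde\delta(P), P)\, \langle \nabla \tilde\delta(P), Q - P \rangle.$$

We now compute the partial derivatives of $V$ and obtain:

$$\begin{cases} \dfrac{\partial V}{\partial \kappa} = -\dfrac{t\sigma^2}{K+1}\left(\tilde\kappa - \tilde f(\kappa, \tilde\kappa, Q)\right) \\[2mm] \dfrac{\partial V}{\partial \tilde\kappa} = -\dfrac{t\sigma^2}{K+1}\left(\kappa - f(\kappa, \tilde\kappa, Q)\right) \end{cases} \tag{55}$$

where $f$ and $\tilde f$ are defined by (34) and (35). The first relation follows from (48) and the second
relation from (49). As $(\delta(Q), \tilde\delta(Q))$ is the solution of system (33), equations (55) imply that:

$$\frac{\partial V}{\partial \kappa}(\delta(Q), \tilde\delta(Q), Q) = \frac{\partial V}{\partial \tilde\kappa}(\delta(Q), \tilde\delta(Q), Q) = 0. \tag{56}$$

Letting $P = \bar Q_*$ and taking into account (56) yields:

$$\langle \nabla \bar I(\bar Q_*), Q - \bar Q_* \rangle = \langle \nabla_Q V(\delta(\bar Q_*), \tilde\delta(\bar Q_*), \bar Q_*), Q - \bar Q_* \rangle,$$

and (iii) is established. ■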
Remark 9: The quantities $\delta_*$ and $\tilde\delta_*$ depend on matrix $\bar Q_*$. Therefore, Proposition 5 does not
provide by itself any optimization algorithm. However, it gives valuable insights on the structure
of $\bar Q_*$. Consider first the case $C = I$ and $\tilde C = I$. Then, $G(\delta_*, \tilde\delta_*)$ is a linear combination of
$I$ and matrix $A^HA$. The eigenvectors of $\bar Q_*$ thus coincide with the right singular vectors of
matrix $A$, a result consistent with the work [19] devoted to the maximization of the EMI $I(Q)$.
If $C = I$ and $\tilde C \neq I$, $G(\delta_*, \tilde\delta_*)$ can be interpreted as a linear combination of matrices $\tilde C$
and $A^HA$. Therefore, if the transmit antennas are correlated, the eigenvectors of the optimum
matrix $\bar Q_*$ coincide with the eigenvectors of some weighted sum of $\tilde C$ and $A^HA$. This result
provides a simple explanation of the impact of correlated transmit antennas on the structure
of the optimal input covariance matrix. The impact of correlated receive antennas on $\bar Q_*$ is
however less intuitive because matrix $A^HA$ has to be replaced with $A^H\left(I + \frac{\tilde\delta_*}{K+1} C\right)^{-1}A$.

B. The optimization algorithm. 

We are now in position to introduce our maximization algorithm of $\bar I$. It is mainly motivated
by the simple observation that for each fixed $(\kappa, \tilde\kappa)$, the maximization w.r.t. $Q$ of function
$V(\kappa, \tilde\kappa, Q)$ defined by (51) can be achieved by a standard waterfilling procedure, which, of
course, does not need the use of numerical techniques. On the other hand, for $Q$ fixed, the
equations (33) have unique solutions that, in practice, can be obtained using a standard fixed-point
algorithm. Our algorithm thus consists in adapting parameters $Q$ and $(\delta, \tilde\delta)$ separately by
the following iterative scheme:

- Initialization: $Q_0 = I$, $(\delta_1, \tilde\delta_1)$ are defined as the unique solutions of system (33) in which
$Q = Q_0 = I$. Then, define $Q_1$ as the maximum of function $Q \mapsto V(\delta_1, \tilde\delta_1, Q)$ on $\mathcal{C}_1$,
which is obtained through a standard waterfilling procedure.

- Iteration $k$: assume $Q_{k-1}$, $(\delta_{k-1}, \tilde\delta_{k-1})$ available. Then, $(\delta_k, \tilde\delta_k)$ is defined as the unique
solution of (33) in which $Q = Q_{k-1}$. Then, define $Q_k$ as the maximum of function
$Q \mapsto V(\delta_k, \tilde\delta_k, Q)$ on $\mathcal{C}_1$.

One can notice that this algorithm is the generalization of the procedure used by [44] for
optimizing the input covariance matrix for correlated Rayleigh MIMO channels.
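
The iterative scheme above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names (`solve_system_33`, `waterfill`, `optimize_covariance`) are hypothetical, system (33) is solved by plain successive substitution (one possible "standard fixed-point algorithm"), the waterfilling step maximizes $\log\det(I + QG)$ under $\mathrm{Tr}\,Q = t$ through the eigendecomposition of $G(\delta_k, \tilde\delta_k)$ from (50), and the test matrices, $K = 1$, $\sigma^2 = 1$ and the cap of 10 outer iterations are illustrative choices.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semidefinite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def solve_system_33(A, C, Ct, Q, K, sigma2, n_iter=1000, tol=1e-13):
    """Successive substitution on (33); an assumed, simple solver."""
    r, t = A.shape
    Qh = psd_sqrt(Q)
    QCQ, AQ, c = Qh @ Ct @ Qh, A @ Qh, K / (K + 1)
    d, dt = 1.0, 1.0
    for _ in range(n_iter):
        X = np.eye(t) + d / (K + 1) * QCQ
        Y = np.eye(r) + dt / (K + 1) * C
        T = np.linalg.inv(sigma2 * Y + c * AQ @ np.linalg.solve(X, AQ.conj().T))
        Tt = np.linalg.inv(sigma2 * X + c * AQ.conj().T @ np.linalg.solve(Y, AQ))
        d_new, dt_new = np.trace(C @ T).real / t, np.trace(QCQ @ Tt).real / t
        step = abs(d_new - d) + abs(dt_new - dt)
        d, dt = d_new, dt_new
        if step < tol:
            break
    return d, dt

def waterfill(G, total):
    """Maximize log det(I + Q G) over Q >= 0 with Tr Q = total."""
    g, U = np.linalg.eigh(G)
    g, U = g[::-1], U[:, ::-1]                 # eigenvalues in decreasing order
    inv = 1.0 / np.maximum(g, 1e-300)
    for n in range(len(g), 0, -1):             # n = number of active modes
        mu = (total + inv[:n].sum()) / n       # water level
        if mu > inv[n - 1]:
            break
    p = np.zeros(len(g))
    p[:n] = mu - inv[:n]
    return (U * p) @ U.conj().T

def optimize_covariance(A, C, Ct, K, sigma2, n_outer=10):
    """Alternate between solving (33) and waterfilling on G of (50)."""
    r, t = A.shape
    Q = np.eye(t)                              # Q_0 = I
    for _ in range(n_outer):
        d, dt = solve_system_33(A, C, Ct, Q, K, sigma2)
        Y = np.eye(r) + dt / (K + 1) * C
        G = (d / (K + 1) * Ct
             + K / ((K + 1) * sigma2) * A.conj().T @ np.linalg.solve(Y, A))
        Q = waterfill((G + G.conj().T) / 2, total=t)
    return Q, d, dt                            # (d, dt) belong to the previous Q

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho^|i-j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

# Illustrative setting (assumed values, not taken from the paper).
rng = np.random.default_rng(0)
r = t = 4
A = rng.standard_normal((r, t))                # stand-in for the LOS component
Q_opt, d_opt, dt_opt = optimize_covariance(A, exp_corr(r, 0.3), exp_corr(t, 0.8),
                                           K=1.0, sigma2=1.0)
wf_demo = np.diag(waterfill(np.diag([4.0, 1.0, 0.1]), 3.0))
print(np.round(np.linalg.eigvalsh(Q_opt), 4), wf_demo)
```

The fixed number of outer iterations is a design shortcut; in practice one could instead stop when successive values of $(\delta_k, \tilde\delta_k)$ no longer change.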

We now study the convergence properties of this algorithm, and state a result which implies
that, if the algorithm converges, then it converges to the unique argmax $\bar Q_*$ of $\bar I$.
Proposition 6: Assume that the two sequences $(\delta_k)_{k \ge 0}$ and $(\tilde\delta_k)_{k \ge 0}$ verify

$$\lim_{k \to +\infty} \left(\delta_k - \delta_{k-1}\right) = 0, \qquad \lim_{k \to +\infty} \left(\tilde\delta_k - \tilde\delta_{k-1}\right) = 0. \tag{57}$$

Then, the sequence $(Q_k)_{k \ge 0}$ converges toward the maximum $\bar Q_*$ of $\bar I$ on $\mathcal{C}_1$.
The proof is given in the appendix.

Remark 10: If the algorithm is convergent, i.e. if sequence $(Q_k)_{k \ge 0}$ converges towards a
matrix $P_*$, Proposition 6 implies that $P_* = \bar Q_*$. In fact, functions $Q \mapsto \delta(Q)$ and $Q \mapsto \tilde\delta(Q)$
are continuous by Proposition 5. As $\delta_k = \delta(Q_{k-1})$ and $\tilde\delta_k = \tilde\delta(Q_{k-1})$, the convergence of $(Q_k)$
thus implies the convergence of $(\delta_k)$ and $(\tilde\delta_k)$, and (57) is fulfilled. Proposition 6 immediately
yields $P_* = \bar Q_*$. Although we have not been able to prove the convergence of the algorithm,
the above result is encouraging, and tends to indicate the algorithm is reliable. In particular, all
the numerical experiments we have conducted indicate that the algorithm converges towards a
certain matrix which must coincide by Proposition 6 with $\bar Q_*$.

VI. Numerical experiments. 

A. When is the number of antennas large enough to reach the asymptotic regime ? 

All our analysis is based on the approximation of the ergodic mutual information. This
approximation consists in assuming the channel matrix to be large. Here we provide typical
simulation results showing that the asymptotic regime is reached for a relatively small number of
antennas. For the simulations provided here we assume:

- $Q = I_t$.

- The chosen line-of-sight (LOS) component $A$ is based on equation (4). The angles of arrival
are chosen randomly according to a uniform distribution.

- Antenna correlation is assumed to decrease exponentially with the inter-antenna distance
i.e. $\tilde C_{ij} = \rho_T^{|i-j|}$, $C_{ij} = \rho_R^{|i-j|}$ with $0 \le \rho_T \le 1$ and $0 \le \rho_R \le 1$.

- $K$ is equal to 1.

Figure 1 represents the EMI $I(Q)$ evaluated by Monte Carlo simulations and its approximation
$\bar I(Q)$ as well as their relative difference (in percentage). Here, the correlation coefficients are
equal to $(\rho_T, \rho_R) = (0.8, 0.3)$ and three different pairs of numbers of antennas are considered:
$(t, r) \in \{(2,2), (4,4), (8,8)\}$. Figure 1 shows that the approximation is reliable even for $r =
t = 2$ in a wide range of SNR.

[Figure 1: Monte Carlo simulations and deterministic approximant for the 2 x 2, 4 x 4 and 8 x 8 systems, plotted versus SNR in dB.]

Fig. 1. The large system approximation is accurate for correlated Rician MIMO channels. The relative difference
between the EMI approximation and that obtained by Monte-Carlo simulations is less than 5 % for a 2 x 2 system
and less than 1 % for a 8 x 8 system.



B. Comparison with the Vu-Paulraj method. 

In this paragraph, we compare our algorithm with the method presented in [42] based
on the maximization of $I(Q)$. We recall that Vu-Paulraj's algorithm is based on a Newton
method and a barrier interior point method. Moreover, the average mutual informations and
their first and second derivatives are evaluated by Monte-Carlo simulations. In fig. 3, we have
evaluated $C_E = \max_{Q \in \mathcal{C}_1} I(Q)$ versus the SNR for $r = t = 4$. Matrix $H$ coincides with
the example considered in [42]. The solid line corresponds to the results provided by
Vu-Paulraj's algorithm; the number of trials used to evaluate the mutual informations and
their first and second derivatives is equal to 30,000, and the maximum number of iterations
of the algorithm in [42] is fixed to 10. The dashed line corresponds to the results provided
by our algorithm: Each point represents $I(\bar Q_*)$ at the corresponding SNR, where $\bar Q_*$ is
the argmax of $\bar I$; the average mutual information at point $\bar Q_*$ is evaluated by Monte-Carlo
simulation (30,000 trials are used). The number of iterations is also limited to 10. Figure 3
shows that our asymptotic approach provides the same results as Vu-Paulraj's algorithm.
However, our algorithm is computationally much more efficient, as the table in Fig. 2 shows.
The table gives the average execution time (in sec.) of one iteration for both algorithms for
$r = t = 2$, $r = t = 4$, $r = t = 8$.

                 n = N = 2      n = N = 4           n = N = 8
Vu-Paulraj       0.75           8.2                 138
New algorithm    $10^{-2}$      $3 \cdot 10^{-2}$   $7 \cdot 10^{-2}$

Fig. 2. Average time per iteration in seconds

In fig. 4, we again compare Vu-Paulraj's algorithm and our proposal. Matrix $A$ is generated
according to (4), the angles being chosen at random. The transmit and receive antenna
correlations are exponential with parameters $0 \le \rho_T \le 1$ and $0 \le \rho_R \le 1$ respectively.
In the experiments, $r = t = 4$, while various values of $\rho_T$, $\rho_R$ and of the Rice factor $K$
have been considered. As in the previous experiment, the maximum number of iterations for
both algorithms is 10, while the number of trials generated to evaluate the average mutual
informations and their derivatives is equal to 30,000. Our approach again provides the same
results as Vu-Paulraj's algorithm, except for low SNRs for $K = 1$, $\rho_T = 0.5$, $\rho_R = 0.8$ where
our method gives better results: at these points, Vu-Paulraj's algorithm seems not to have
converged at the 10th iteration.

VII. Conclusions 

In this paper, an explicit approximation for the ergodic mutual information for Rician MIMO
channels with transmit and receive antenna correlation is provided. This approximation is based
on the asymptotic Random Matrix Theory. The accuracy of the approximation has been studied
both analytically and numerically. It has been shown to be very accurate even for small MIMO
systems: The relative error is less than 5 % for a 2 x 2 MIMO channel and less than 1 % for an
8 x 8 MIMO channel.

The derived expression for the EMI has been exploited to derive an efficient optimization
algorithm providing the optimum covariance matrix.

Fig. 3. Comparison with the Vu-Paulraj algorithm I

[Figure 4: curves for (K=0.1, ρ=0.98, ρ=0.9), (K=0.1, ρ=0.8, ρ=0.5) and (K=1, ρ=0.8, ρ=0.5), plotted versus SNR (dB).]

Fig. 4. Comparison with the Vu-Paulraj algorithm II



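Appendix I

Proof of the existence and uniqueness of the solution of the system (11).
We consider functions $g(\kappa, \tilde\kappa)$ and $\tilde g(\kappa, \tilde\kappa)$ defined by

$$g(\kappa, \tilde\kappa) = \frac{1}{\kappa}\,\frac{1}{t}\mathrm{Tr}\left[ D \left( \sigma^2(I_r + D\tilde\kappa) + B(I_t + \tilde D\kappa)^{-1}B^H \right)^{-1} \right]$$
$$\tilde g(\kappa, \tilde\kappa) = \frac{1}{\tilde\kappa}\,\frac{1}{t}\mathrm{Tr}\left[ \tilde D \left( \sigma^2(I_t + \tilde D\kappa) + B^H(I_r + D\tilde\kappa)^{-1}B \right)^{-1} \right] \tag{58}$$

For each $\tilde\kappa > 0$ fixed, function $\kappa \mapsto g(\kappa, \tilde\kappa)$ is clearly strictly decreasing, converges toward $+\infty$
if $\kappa \to 0$ and converges to 0 if $\kappa \to +\infty$. Therefore, there exists a unique $\kappa > 0$ satisfying
$g(\kappa, \tilde\kappa) = 1$. As this solution depends on $\tilde\kappa$, it is denoted $h(\tilde\kappa)$ in the following. We claim that

- (i) Function $\tilde\kappa \mapsto h(\tilde\kappa)$ is strictly decreasing,

- (ii) Function $\tilde\kappa \mapsto \tilde\kappa\, h(\tilde\kappa)$ is strictly increasing.

In fact, consider $\tilde\kappa_2 > \tilde\kappa_1$. It is easily checked that for each $\kappa > 0$, $g(\kappa, \tilde\kappa_1) > g(\kappa, \tilde\kappa_2)$.
Hence, the solutions $h(\tilde\kappa_1)$ and $h(\tilde\kappa_2)$ of the equations $g(\kappa, \tilde\kappa_1) = 1$ and $g(\kappa, \tilde\kappa_2) = 1$ satisfy
$h(\tilde\kappa_1) > h(\tilde\kappa_2)$. This establishes (i). To prove (ii), we use the obvious relation $g(h(\tilde\kappa_1), \tilde\kappa_1) -
g(h(\tilde\kappa_2), \tilde\kappa_2) = 0$. We denote by $(U_i)_{i=1,2}$ the matrices

$$U_i = \sigma^2\left( h(\tilde\kappa_i)\, I + \tilde\kappa_i h(\tilde\kappa_i)\, D \right) + B\left( \frac{I}{h(\tilde\kappa_i)} + \tilde D \right)^{-1} B^H.$$

It is clear that $g(h(\tilde\kappa_i), \tilde\kappa_i) = \frac{1}{t}\mathrm{Tr}\, D U_i^{-1}$. We express $g(h(\tilde\kappa_1), \tilde\kappa_1) - g(h(\tilde\kappa_2), \tilde\kappa_2)$ as

$$g(h(\tilde\kappa_1), \tilde\kappa_1) - g(h(\tilde\kappa_2), \tilde\kappa_2) = \frac{1}{t}\mathrm{Tr}\, D\left( U_1^{-1} - U_2^{-1} \right)$$

and use the identity

$$U_1^{-1} - U_2^{-1} = U_1^{-1}\left( U_2 - U_1 \right) U_2^{-1}. \tag{59}$$

Using the form of matrices $(U_i)_{i=1,2}$, we eventually obtain that

$$g(h(\tilde\kappa_1), \tilde\kappa_1) - g(h(\tilde\kappa_2), \tilde\kappa_2) = u\left( h(\tilde\kappa_2) - h(\tilde\kappa_1) \right) + v\left( \tilde\kappa_2 h(\tilde\kappa_2) - \tilde\kappa_1 h(\tilde\kappa_1) \right),$$

where $u$ and $v$ are the strictly positive terms defined by

$$u = \frac{1}{t}\mathrm{Tr}\, D U_1^{-1} \left( \sigma^2 I + B\left(I + h(\tilde\kappa_2)\tilde D\right)^{-1}\left(I + h(\tilde\kappa_1)\tilde D\right)^{-1} B^H \right) U_2^{-1}$$

and

$$v = \frac{\sigma^2}{t}\mathrm{Tr}\, D U_1^{-1} D U_2^{-1}.$$

As $u\left( h(\tilde\kappa_2) - h(\tilde\kappa_1) \right) + v\left( \tilde\kappa_2 h(\tilde\kappa_2) - \tilde\kappa_1 h(\tilde\kappa_1) \right) = 0$, $h(\tilde\kappa_2) - h(\tilde\kappa_1) < 0$ implies that
$\tilde\kappa_2 h(\tilde\kappa_2) - \tilde\kappa_1 h(\tilde\kappa_1) > 0$. Hence, $\tilde\kappa\, h(\tilde\kappa)$ is a strictly increasing function as expected.

From this, it follows that function $\tilde\kappa \mapsto \tilde g(h(\tilde\kappa), \tilde\kappa)$ is strictly decreasing. This function
converges to $+\infty$ if $\tilde\kappa \to 0$ and to 0 if $\tilde\kappa \to +\infty$. Therefore, the equation

$$\tilde g(h(\tilde\kappa), \tilde\kappa) = 1$$

has a unique strictly positive solution $\tilde\beta$. If $\beta = h(\tilde\beta)$, it is clear that $g(\beta, \tilde\beta) = 1$ and $\tilde g(\beta, \tilde\beta) = 1$.
Therefore, we have shown that $(\beta, \tilde\beta)$ is the unique solution of (11) satisfying $\beta > 0$ and $\tilde\beta > 0$.

Appendix II
Proof of Theorem 2

This section is organized as follows. We first recall in subsection II-A some useful
mathematical tools. In subsection II-B, we establish (26). In II-C, we prove (27) and (28).

We shall use the following notations. If $u$ is a random variable, the zero mean random variable
$u - E(u)$ is denoted by $\overset{\circ}{u}$. If $z = x + iy$ is a complex number, the differential operators $\frac{\partial}{\partial z}$ and
$\frac{\partial}{\partial \bar z}$ are defined respectively by $\frac{1}{2}\left( \frac{\partial}{\partial x} - i\frac{\partial}{\partial y} \right)$ and $\frac{1}{2}\left( \frac{\partial}{\partial x} + i\frac{\partial}{\partial y} \right)$. Finally, if $\Sigma$, $B$, $Y$ are given
matrices, we denote respectively by $\xi_j$, $b_j$, $y_j$ their columns.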

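A. Mathematical tools.

1) The Poincaré-Nash inequality: (see e.g. [7], [21]). Let $x = [x_1, \ldots, x_M]^T$ be a complex
Gaussian random vector whose law is given by $E[x] = 0$, $E[xx^T] = 0$, and $E[xx^H] = \Xi$.
Let $\Phi = \Phi(x_1, \ldots, x_M, \bar x_1, \ldots, \bar x_M)$ be a complex function polynomially bounded together
with its partial derivatives. Then the following inequality holds true:

$$\mathrm{Var}(\Phi(x)) \le E\left[ \nabla_z\Phi(x)^T\, \Xi\, \overline{\nabla_z\Phi(x)} \right] + E\left[ \left(\nabla_{\bar z}\Phi(x)\right)^H\, \Xi\, \nabla_{\bar z}\Phi(x) \right]$$

where $\nabla_z\Phi = [\partial\Phi/\partial z_1, \ldots, \partial\Phi/\partial z_M]^T$ and $\nabla_{\bar z}\Phi = [\partial\Phi/\partial \bar z_1, \ldots, \partial\Phi/\partial \bar z_M]^T$.

Let $Y$ be the $r \times t$ matrix $Y = \frac{1}{\sqrt{t}} D^{1/2} X \tilde D^{1/2}$, where $X$ has i.i.d. $\mathcal{CN}(0, 1)$ entries and consider
the stacked $rt \times 1$ vector $x = [Y_{11}, \ldots, Y_{rt}]^T$. In this case, the Poincaré-Nash inequality writes

$$\mathrm{Var}(\Phi(Y)) \le \frac{1}{t}\sum_{i=1}^{r}\sum_{j=1}^{t} d_i \tilde d_j\, E\left[ \left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2 + \left|\frac{\partial\Phi(Y)}{\partial \bar Y_{ij}}\right|^2 \right]. \tag{60}$$

2) The differentiation formula for functions of Gaussian random vectors: With $x$ and $\Phi$ given
as above, we have the following

$$E\left[ x_p \Phi(x) \right] = \sum_{m=1}^{M} [\Xi]_{pm}\, E\left[ \frac{\partial\Phi(x)}{\partial \bar x_m} \right]. \tag{61}$$

This formula relies on an integration by parts, and is thus referred to as the Integration by
parts formula for Gaussian vectors. It is widely used in Mathematical Physics ([14]) and has
been used in Random Matrix Theory in [25] and [32].

If $x$ coincides with the $rt \times 1$ vector $x = [Y_{11}, \ldots, Y_{rt}]^T$, relation (61) becomes

$$E\left[ Y_{pq} \Phi(Y) \right] = \frac{d_p \tilde d_q}{t}\, E\left[ \frac{\partial\Phi(Y)}{\partial \bar Y_{pq}} \right]. \tag{62}$$

Replacing matrix $Y$ by matrix $\bar Y$ also provides

$$E\left[ \bar Y_{pq} \Phi(Y) \right] = \frac{d_p \tilde d_q}{t}\, E\left[ \frac{\partial\Phi(Y)}{\partial Y_{pq}} \right]. \tag{63}$$

3) Some useful differentiation formulas: The following partial derivatives $\frac{\partial S_{pq}}{\partial Y_{ij}}$ and $\frac{\partial S_{pq}}{\partial \bar Y_{ij}}$
for each $p, q \in \{1, \ldots, r\}$ and $1 \le i \le r$, $1 \le j \le t$ will be of use in the sequel. Straightforward
computations yield:

$$\begin{cases} \dfrac{\partial S_{pq}}{\partial Y_{ij}} = -S_{p,i}\left( \xi_j^H S \right)_q \\[2mm] \dfrac{\partial S_{pq}}{\partial \bar Y_{ij}} = -\left( S\xi_j \right)_p S_{i,q} \end{cases} \tag{64}$$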

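B. Proof of (26)

We just prove that the variance of $\frac{1}{t}\mathrm{Tr}(MS)$ is a $O(t^{-2})$ term. For this, we note that the
random variable $\frac{1}{t}\mathrm{Tr}(MS)$ can be interpreted as a function $\Phi(Y)$ of the entries of matrix $Y$,
and apply the Poincaré-Nash inequality (60) to $\Phi(Y)$. Function $\Phi(Y)$ is equal to

$$\Phi(Y) = \frac{1}{t}\sum_{p,q} M_{q,p} S_{p,q}.$$

Therefore, the partial derivative of $\Phi(Y)$ with respect to $Y_{ij}$ is given by $\frac{\partial\Phi(Y)}{\partial Y_{ij}} =
\frac{1}{t}\sum_{p,q} M_{q,p} \frac{\partial S_{p,q}}{\partial Y_{ij}}$ which, by (64), coincides with

$$\frac{\partial\Phi(Y)}{\partial Y_{ij}} = -\frac{1}{t}\sum_{p,q} M_{q,p}\, S_{p,i}\left( \xi_j^H S \right)_q = -\frac{1}{t}\left( \xi_j^H S M S \right)_i.$$

As $d_i \le d_{\max}$ and $\tilde d_j \le \tilde d_{\max}$, it is clear that

$$\sum_{i=1}^{r}\sum_{j=1}^{t} d_i \tilde d_j\, E\left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2 \le d_{\max}\tilde d_{\max} \sum_{i=1}^{r}\sum_{j=1}^{t} E\left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2.$$

It is easily seen that

$$\sum_{i=1}^{r} \left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2 = \frac{1}{t^2}\, \xi_j^H S M S^2 M^H S \xi_j.$$

As $\|S\| \le \frac{1}{\sigma^2}$ and $\sup_t \|M\| < \infty$, $\xi_j^H S M S^2 M^H S \xi_j$ is less than $\frac{1}{\sigma^8}\sup_t \|M\|^2\, \|\xi_j\|^2$. Moreover,
$E\|\xi_j\|^2$ coincides with $\|b_j\|^2 + \frac{1}{t}\tilde d_j \sum_{i=1}^{r} d_i$, which is itself less than $b_{\max}^2 + d_{\max}\tilde d_{\max}\frac{r}{t}$,
a uniformly bounded term. Therefore, $\sum_{i=1}^{r} E\left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2$ is a $O(t^{-2})$ term. This proves that

$$\frac{1}{t}\sum_{i=1}^{r}\sum_{j=1}^{t} d_i \tilde d_j\, E\left|\frac{\partial\Phi(Y)}{\partial Y_{ij}}\right|^2 = O\left(\frac{1}{t^2}\right).$$

It can be shown similarly that $\frac{1}{t}\sum_{i=1}^{r}\sum_{j=1}^{t} d_i \tilde d_j\, E\left|\frac{\partial\Phi(Y)}{\partial \bar Y_{ij}}\right|^2 = O(t^{-2})$. The conclusion
follows from the Poincaré-Nash inequality (60).

C. Proof of (27) and (28).

As we shall see, proofs of (27) and (28) are demanding. We first introduce the following
notations: Define scalar parameters $\eta(\sigma^2)$, $\alpha(\sigma^2)$, $\tilde\alpha(\sigma^2)$ as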
rjia') = iTr(DS(a2)) 

E[iTr(DS(a2))] (65) 



a(a2) 
d(a2) 



iTr (DS(a2 



cj2 (I + dD) + B I + aD 



and matrices R((j^), R(cr^) as 

K{a') = 
R(a2) = 

We note that, as a (cj2) > and d(cr2) > 0, then 



B 



H 



( I + aD ) + B^ (I + dD)"^ B 



(66) 



< R((t2) < II, < R((t2) < ^ 



(67) 



It is difficult to study directly the term $\frac{1}{t}\mathrm{Tr}\,M(E(S) - T)$. In some sense, matrix $R$ can be seen
as an intermediate quantity between $E(S)$ and $T$. Thus the proof consists of two steps: 1) for
each uniformly bounded matrix $M$, we first prove that $\frac{1}{t}\mathrm{Tr}\,M(E(S) - R)$ and $\frac{1}{t}\mathrm{Tr}\,M(R - T)$
converge to 0 as $t \to \infty$; 2) we then refine the previous result and establish in fact that
$\frac{1}{t}\mathrm{Tr}\,M(E(S) - R)$ and $\frac{1}{t}\mathrm{Tr}\,M(R - T)$ are $O(t^{-2})$ terms. This, of course, implies (27). Eq. (28)
eventually follows from Eq. (27), the integral representation



Tr(E(S(L^)) -T(lj)) dw, 



(68) 



which follows from (20) and (22), as well as a dominated convergence argument that is omitted. 

1) First step: Convergence of $\frac{1}{t}\mathrm{Tr}\,M(E(S) - R)$ and $\frac{1}{t}\mathrm{Tr}\,M(R - T)$ to zero: The first step
consists in showing the following Proposition.

Proposition 7: For each deterministic $r \times r$ matrix $M$, uniformly bounded (for the spectral
norm) as $t \to \infty$, we have:

$$\lim_{t \to +\infty} \frac{1}{t}\mathrm{Tr}\left[ M(E(S) - R) \right] = 0 \tag{69}$$
$$\lim_{t \to +\infty} \frac{1}{t}\mathrm{Tr}\left[ M(R - T) \right] = 0 \tag{70}$$

Proof: We first prove (69). For this, we state the following useful Lemma.
Lemma 2: Let $P$, $P_1$ and $P_2$ be deterministic $r \times t$, $t \times t$, $t \times r$ matrices respectively, uniformly
bounded with respect to the spectral norm as $t \to \infty$. Consider the following functions of $Y$.

$$\Phi(Y) = \frac{1}{t}\mathrm{Tr}\left[ S P \Sigma^H \right], \quad \Psi(Y) = \frac{1}{t}\mathrm{Tr}\left[ S \Sigma P_1 \Sigma^H P_2 \right], \quad \Psi'(Y) = \frac{1}{t}\mathrm{Tr}\left[ S \Sigma P_1 Y^H P_2 \right].$$

Then, the following estimates hold true:

$$\mathrm{Var}(\Phi) = O\left(\frac{1}{t^2}\right), \quad \mathrm{Var}(\Psi) = O\left(\frac{1}{t^2}\right), \quad \mathrm{Var}(\Psi') = O\left(\frac{1}{t^2}\right).$$

The proof, based on the Poincaré-Nash inequality (60), is omitted.
In order to use the Integration by parts formula (62), notice that

$$\sigma^2 S(\sigma^2) + S(\sigma^2)\Sigma\Sigma^H = I. \tag{71}$$

Taking the mathematical expectation, we have for each $p, q \in \{1, \ldots, r\}$:

$$\sigma^2 E(S_{pq}) + E\left[ (S\Sigma\Sigma^H)_{pq} \right] = \delta(p - q). \tag{72}$$

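A convenient use of the Integration by parts formula allows to express $E\left[ (S\Sigma\Sigma^H)_{pq} \right]$ in terms
of the entries of $E(S)$. To see this, note that

$$E\left[ (S\Sigma\Sigma^H)_{pq} \right] = \sum_{j=1}^{t}\sum_{i=1}^{r} E\left( S_{pi} \Sigma_{ij} \bar\Sigma_{qj} \right).$$

For each $i$, $E(S_{pi}\Sigma_{ij}\bar\Sigma_{qj})$ can be written as

$$E(S_{pi}\Sigma_{ij}\bar\Sigma_{qj}) = E(S_{pi}) B_{ij} \bar B_{qj} + E\left( S_{pi} \bar Y_{qj} \right) B_{ij} + E\left( S_{pi} Y_{ij} \bar\Sigma_{qj} \right).$$

Using (62) with function $\Phi(Y) = S_{pi}\bar\Sigma_{qj}$ and (63) with $\Phi(Y) = S_{pi}$, and summing over index
$i$ yields:

$$E\left[ (S\xi_j)_p \bar\Sigma_{qj} \right] = \frac{d_q \tilde d_j}{t} E(S_{pq}) - \tilde d_j\, E\left[ \eta\, (S\xi_j)_p \bar\Sigma_{qj} \right] - \frac{d_q \tilde d_j}{t} E\left[ S_{pq}\, \xi_j^H S b_j \right] + E\left[ (Sb_j)_p \right] \bar B_{qj}. \tag{73}$$

Eq. (26) for $M = D$ implies that $\mathrm{Var}(\eta) = O(t^{-2})$, or equivalently that $E(\overset{\circ}{\eta}^2) = O(t^{-2})$. We
now complete the proof of (69). We take Eq. (73) as a starting point, and write $\eta$ as $\eta = E(\eta) + \overset{\circ}{\eta} =
\alpha + \overset{\circ}{\eta}$. Therefore,

$$E\left[ \eta\, (S\xi_j)_p \bar\Sigma_{qj} \right] = \alpha\, E\left[ (S\xi_j)_p \bar\Sigma_{qj} \right] + E\left[ \overset{\circ}{\eta}\, (S\xi_j)_p \bar\Sigma_{qj} \right].$$

Plugging this relation into (73), and solving w.r.t. $E\left[ (S\xi_j)_p \bar\Sigma_{qj} \right]$ yields

$$E\left[ (S\xi_j)_p \bar\Sigma_{qj} \right] = \frac{d_q \tilde d_j}{t(1 + \alpha\tilde d_j)} E(S_{pq}) + \frac{1}{1 + \alpha\tilde d_j} E\left[ (Sb_j)_p \right] \bar B_{qj} - \frac{d_q \tilde d_j}{t(1 + \alpha\tilde d_j)} E\left[ S_{pq}\, \xi_j^H S b_j \right] - \frac{\tilde d_j}{1 + \alpha\tilde d_j} E\left[ \overset{\circ}{\eta}\, (S\xi_j)_p \bar\Sigma_{qj} \right].$$

Writing $\xi_j = b_j + y_j$, and summing over $j$ provides the following expression of $E\left[ (S\Sigma\Sigma^H)_{pq} \right]$:

$$E\left[ (S\Sigma\Sigma^H)_{pq} \right] = d_q\, \frac{1}{t}\mathrm{Tr}\left[ \tilde D(I + \alpha\tilde D)^{-1} \right] E(S_{pq}) + E\left[ \left( SB(I + \alpha\tilde D)^{-1}B^H \right)_{pq} \right] - d_q\, E\left[ S_{pq}\, \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}B^H \right) \right] - d_q\, E\left[ S_{pq}\, \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}Y^H \right) \right] - E\left[ \overset{\circ}{\eta} \left( S\Sigma\tilde D(I + \alpha\tilde D)^{-1}\Sigma^H \right)_{pq} \right]. \tag{74}$$

The resolvent identity (71) thus implies that

$$\delta(p - q) = \sigma^2 E(S_{pq}) + d_q\, \frac{1}{t}\mathrm{Tr}\left[ \tilde D(I + \alpha\tilde D)^{-1} \right] E(S_{pq}) + E\left[ \left( SB(I + \alpha\tilde D)^{-1}B^H \right)_{pq} \right] - d_q\, E\left[ S_{pq}\, \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}B^H \right) \right] - d_q\, E\left[ S_{pq}\, \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}Y^H \right) \right] - E\left[ \overset{\circ}{\eta} \left( S\Sigma\tilde D(I + \alpha\tilde D)^{-1}\Sigma^H \right)_{pq} \right]. \tag{75}$$

In order to simplify the notations, we define $\rho_1$ and $\rho_2$ by

$$\rho_1 = \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}B^H \right) \quad\text{and}\quad \rho_2 = \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}Y^H \right).$$

For $i = 1, 2$, we write $E(S_{pq}\rho_i)$ as

$$E(S_{pq}\rho_i) = E(S_{pq})\, E(\rho_i) + E(S_{pq}\, \overset{\circ}{\rho}_i).$$

Thus, (75) can be written as

$$\delta(p - q) = \sigma^2 E(S_{pq}) + d_q\, \frac{1}{t}\mathrm{Tr}\left[ \tilde D(I + \alpha\tilde D)^{-1} \right] E(S_{pq}) + \left[ E(S)B(I + \alpha\tilde D)^{-1}B^H \right]_{pq} - d_q E(S_{pq})\, \frac{1}{t}\mathrm{Tr}\left( E(S)B\tilde D(I + \alpha\tilde D)^{-1}B^H \right) - d_q E(S_{pq})\, E\left[ \frac{1}{t}\mathrm{Tr}\left( SB\tilde D(I + \alpha\tilde D)^{-1}Y^H \right) \right] - d_q E(S_{pq}\, \overset{\circ}{\rho}_1) - d_q E(S_{pq}\, \overset{\circ}{\rho}_2) - E\left[ \overset{\circ}{\eta} \left( S\Sigma\tilde D(I + \alpha\tilde D)^{-1}\Sigma^H \right)_{pq} \right]. \tag{76}$$

We now establish the following lemma.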
Lemma 3: 



Ep2 



where ps is defined by 



E 



-Tr (SBD(I + aD)-^Y 



1. 



2-dH 



-a-Tr (E(S)BD^(I + aD)^^B 



e(^P°3) 



P3 = -Tr ( SBD^(I + aD)-^S 



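Proof: We express $E(\rho_2)$ as

\[
E(\rho_2) = \frac{1}{t}\sum_{j=1}^{t}\frac{\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(y_j^H S b_j\right)
\tag{78}
\]

and evaluate $E(y_j^H S b_j) = \sum_i E[(Sb_j)_i\,\overline{Y_{ij}}]$ using formula (63) for $\Phi(Y) = (Sb_j)_i$. This gives

\[
E\left[(Sb_j)_i\,\overline{Y_{ij}}\right] = \frac{d_i\tilde{d}_j}{t}\sum_k E\left(\frac{\partial S_{ik}}{\partial Y_{ij}}\right) B_{kj} .
\]

By (64),

\[
E\left(\frac{\partial S_{ik}}{\partial Y_{ij}}\right) = -E\left(S_{ii}\,(b_j^H S)_k\right) - E\left(S_{ii}\,(y_j^H S)_k\right) .
\]

Therefore,

\[
E\left(y_j^H S b_j\right) = -\tilde{d}_j\,E\left(\eta\, b_j^H S b_j\right) - \tilde{d}_j\,E\left(\eta\, y_j^H S b_j\right) .
\tag{79}
\]

Writing again $\eta = E(\eta) + \mathring{\eta} = \alpha + \mathring{\eta}$, we get that

\[
E\left(y_j^H S b_j\right) = -\alpha\tilde{d}_j\,E\left(b_j^H S b_j\right) - \alpha\tilde{d}_j\,E\left(y_j^H S b_j\right) - \tilde{d}_j\,E\left(\mathring{\eta}\, b_j^H S b_j\right) - \tilde{d}_j\,E\left(\mathring{\eta}\, y_j^H S b_j\right) .
\tag{80}
\]

Solving this equation w.r.t. $E(y_j^H S b_j)$ yields

\[
E\left(y_j^H S b_j\right) = -\frac{\alpha\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(b_j^H S b_j\right) - \frac{\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(\mathring{\eta}\,\xi_j^H S b_j\right)
\]

or equivalently

\[
E\left(y_j^H S b_j\right) = -\frac{\alpha\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(b_j^H S b_j\right) - \frac{\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(\mathring{\eta}\, b_j^H S b_j\right) - \frac{\tilde{d}_j}{1+\alpha\tilde{d}_j}\,E\left(\mathring{\eta}\, y_j^H S b_j\right) .
\tag{81}
\]

Eq. (77) immediately follows from (78), (81), and the relation $E(\mathring{\eta}\,\rho_3) = E(\mathring{\eta}\,\mathring{\rho}_3)$. ■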
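Plugging (77) into (76) yields

\[
\begin{aligned}
\delta(p-q) + \Delta_{pq} = {} & E(S_{pq})\left[\sigma^2 + d_q\left(\frac{1}{t}\mathrm{Tr}\,\tilde{D}(I+\alpha\tilde{D})^{-1} - E(\rho_1) + \alpha\,\frac{1}{t}\mathrm{Tr}\,E(S)B\tilde{D}^2(I+\alpha\tilde{D})^{-2}B^H\right)\right] \\
& + \left[E(S)B(I+\alpha\tilde{D})^{-1}B^H\right]_{pq}
\end{aligned}
\tag{82}
\]

where $\Delta$ is the $r \times r$ matrix defined by

\[
\Delta_{pq} = E\left[\mathring{\eta}\left(S\Sigma\tilde{D}(I+\alpha\tilde{D})^{-1}\Sigma^H\right)_{pq}\right] + d_q\,E\left(\mathring{S}_{pq}(\mathring{\rho}_1+\mathring{\rho}_2)\right) - d_q\,E(S_{pq})\,E(\mathring{\eta}\,\mathring{\rho}_3)
\]

for each $p, q$ or equivalently by

\[
\Delta = E\left[\mathring{\eta}\,S\Sigma\tilde{D}(I+\alpha\tilde{D})^{-1}\Sigma^H\right] + E\left((\mathring{\rho}_1+\mathring{\rho}_2)\,\mathring{S}\right) D - E(\mathring{\eta}\,\mathring{\rho}_3)\,E(S)\,D .
\]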



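Using the relation $\alpha\tilde{D}(I+\alpha\tilde{D})^{-1} = I - (I+\alpha\tilde{D})^{-1}$, we obtain that

\[
\begin{aligned}
\alpha\,\frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}^2(I+\alpha\tilde{D})^{-2}B^H\right)
&= \frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}(I+\alpha\tilde{D})^{-1}B^H\right) - \frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}(I+\alpha\tilde{D})^{-2}B^H\right) \\
&= E(\rho_1) - \frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}(I+\alpha\tilde{D})^{-2}B^H\right) .
\end{aligned}
\]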
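
Since the matrix involved is diagonal, the identity used in this step can be checked entry by entry. The following worked line is only a sketch of that verification; the symbol $\tilde{d}$ stands for a generic nonnegative diagonal entry and is introduced here for illustration.

```latex
% Entrywise check, for a diagonal entry \tilde{d} \ge 0 and \alpha \ge 0:
\[
\frac{\alpha\tilde{d}^{\,2}}{(1+\alpha\tilde{d})^{2}}
= \frac{\tilde{d}}{1+\alpha\tilde{d}}\cdot\frac{\alpha\tilde{d}}{1+\alpha\tilde{d}}
= \frac{\tilde{d}}{1+\alpha\tilde{d}}\left(1-\frac{1}{1+\alpha\tilde{d}}\right)
= \frac{\tilde{d}}{1+\alpha\tilde{d}}-\frac{\tilde{d}}{(1+\alpha\tilde{d})^{2}} .
\]
```

Applying this to each diagonal entry and taking the normalized trace against $B^H E(S) B$ gives the two-term decomposition stated above.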

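Therefore, the term

\[
\frac{1}{t}\mathrm{Tr}\,\tilde{D}(I+\alpha\tilde{D})^{-1} - E(\rho_1) + \alpha\,\frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}^2(I+\alpha\tilde{D})^{-2}B^H\right)
\tag{83}
\]

is equal to

\[
\frac{1}{t}\mathrm{Tr}\,\tilde{D}(I+\alpha\tilde{D})^{-1} - \frac{1}{t}\mathrm{Tr}\left(E(S)B\tilde{D}(I+\alpha\tilde{D})^{-2}B^H\right)
= \frac{1}{t}\mathrm{Tr}\left[\tilde{D}(I+\alpha\tilde{D})^{-1}\left(I - B^H E(S) B (I+\alpha\tilde{D})^{-1}\right)\right]
\]

which, in turn, coincides with $\sigma^2\tilde{\tau}$, where $\tilde{\tau}$ is defined by

\[
\tilde{\tau}(\sigma^2) = \frac{1}{t}\mathrm{Tr}\left[\tilde{D}\left(\sigma^2(I+\alpha\tilde{D})\right)^{-1}\left(I - B^H E(S(\sigma^2)) B (I+\alpha\tilde{D})^{-1}\right)\right] .
\tag{84}
\]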



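Eq. (82) is thus equivalent to

\[
E(S)\left[\sigma^2(I+\tilde{\tau}D) + B(I+\alpha\tilde{D})^{-1}B^H\right] = I + \Delta
\tag{85}
\]

or equivalently to

\[
E(S)\left[\sigma^2(I+\tilde{\alpha}D) + B(I+\alpha\tilde{D})^{-1}B^H\right] = I + \sigma^2(\tilde{\alpha}-\tilde{\tau})\,E(S)\,D + \Delta
\]

or to

\[
E(S) = R + \sigma^2(\tilde{\alpha}-\tilde{\tau})\,E(S)\,D\,R + \Delta R .
\tag{86}
\]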



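We now verify that if $M$ is a deterministic, uniformly bounded matrix for the spectral norm
as $t \to \infty$, then $\frac{1}{t}\mathrm{Tr}\,\Delta R M = O(t^{-2})$. For this, we write $\frac{1}{t}\mathrm{Tr}\,\Delta R M$ as
$\frac{1}{t}\mathrm{Tr}\,\Delta R M = T_1 + T_2 - T_3$ where

\[
\begin{aligned}
T_1 &= E\left[\mathring{\eta}\,\frac{1}{t}\mathrm{Tr}\left(S\Sigma\tilde{D}(I+\alpha\tilde{D})^{-1}\Sigma^H R M\right)\right] \\
T_2 &= E\left[(\mathring{\rho}_1+\mathring{\rho}_2)\,\frac{1}{t}\mathrm{Tr}(\mathring{S} D R M)\right] \\
T_3 &= E(\mathring{\eta}\,\mathring{\rho}_3)\,\frac{1}{t}\mathrm{Tr}\left(E(S) D R M\right) .
\end{aligned}
\]

We denote by $\rho_4$ the term

\[
\rho_4 = \frac{1}{t}\mathrm{Tr}\left(S\Sigma\tilde{D}(I+\alpha\tilde{D})^{-1}\Sigma^H R M\right)
\]

and notice that $T_1 = E(\mathring{\eta}\,\mathring{\rho}_4)$. Eq. (26) implies that $E(\mathring{\eta}^2)$ and
$E\left[\left(\frac{1}{t}\mathrm{Tr}(\mathring{S} D R M)\right)^2\right]$ are $O(t^{-2})$ terms. Moreover, matrix $R$ is uniformly bounded
for the spectral norm as $t \to \infty$ (see (67)). Lemma 2 immediately shows that for each $i = 1,2,3,4$,
$E(\mathring{\rho}_i^{\,2})$ is a $O(t^{-2})$ term. The Cauchy-Schwarz inequality eventually provides
$\frac{1}{t}\mathrm{Tr}\,\Delta R M = O(t^{-2})$.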

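In order to establish (69), it remains to show that $\tilde{\alpha} - \tilde{\tau} \to 0$. For this, we remark that
exchanging the roles of matrices $\Sigma$ and $\Sigma^H$ leads to the following relation

\[
E(\tilde{S}) = \tilde{R} + \sigma^2(\alpha-\tau)\,E(\tilde{S})\,\tilde{D}\,\tilde{R} + \tilde{\Delta}\tilde{R}
\tag{87}
\]

where $\tau(\sigma^2)$ is defined by

\[
\tau(\sigma^2) = \frac{1}{t}\mathrm{Tr}\left[D\left(\sigma^2(I+\tilde{\alpha}D)\right)^{-1}\left(I - B\,E(\tilde{S}(\sigma^2))\,B^H(I+\tilde{\alpha}D)^{-1}\right)\right]
\tag{88}
\]

and where $\tilde{\Delta}$, the analogue of $\Delta$, satisfies

\[
\frac{1}{t}\mathrm{Tr}(\tilde{\Delta}\tilde{M}) = O\left(\frac{1}{t^2}\right)
\tag{89}
\]

for every matrix $\tilde{M}$ uniformly bounded for the spectral norm.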

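Equations (86) and (87) allow us to evaluate $\tilde{\alpha}$ and $\tilde{\tau}$. More precisely, writing
$\tilde{\alpha} = \frac{1}{t}\mathrm{Tr}(\tilde{D}E(\tilde{S}))$ and using the expression (87) of $E(\tilde{S})$, we obtain that

\[
\tilde{\alpha} = \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R}) + \sigma^2(\alpha-\tau)\,\frac{1}{t}\mathrm{Tr}(\tilde{D}E(\tilde{S})\tilde{D}\tilde{R}) + \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{\Delta}\tilde{R}) .
\tag{90}
\]

Similarly, replacing $E(S)$ by (86) into the expression (84) of $\tilde{\tau}$, we get that

\[
\begin{aligned}
\tilde{\tau} = {} & \frac{1}{t}\mathrm{Tr}\left[\tilde{D}\left(\sigma^2(I+\alpha\tilde{D})\right)^{-1}\left(I - B^H R B (I+\alpha\tilde{D})^{-1}\right)\right] \\
& - (\tilde{\alpha}-\tilde{\tau})\,\frac{1}{t}\mathrm{Tr}\left[\tilde{D}(I+\alpha\tilde{D})^{-1}B^H E(S) D R B (I+\alpha\tilde{D})^{-1}\right] \\
& - \frac{1}{t}\mathrm{Tr}\left[\tilde{D}\left(\sigma^2(I+\alpha\tilde{D})\right)^{-1}B^H \Delta R B (I+\alpha\tilde{D})^{-1}\right] .
\end{aligned}
\tag{91}
\]

Using standard algebra, it is easy to check that the first term of the right-hand side of (91)
coincides with $\frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R})$. Subtracting (91) from (90), we get that

\[
(\alpha-\tau)\,\tilde{u}_0 + (\tilde{\alpha}-\tilde{\tau})\,\tilde{v}_0 = \tilde{\epsilon}
\tag{92}
\]

where

\[
\begin{aligned}
\tilde{u}_0 &= \sigma^2\,\frac{1}{t}\mathrm{Tr}(\tilde{D}E(\tilde{S})\tilde{D}\tilde{R}) \\
\tilde{v}_0 &= 1 - \frac{1}{t}\mathrm{Tr}\left[\tilde{D}(I+\alpha\tilde{D})^{-1}B^H E(S) D R B (I+\alpha\tilde{D})^{-1}\right] \\
\tilde{\epsilon} &= \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{\Delta}\tilde{R}) + \frac{1}{t}\mathrm{Tr}\left[\tilde{D}\left(\sigma^2(I+\alpha\tilde{D})\right)^{-1}B^H \Delta R B (I+\alpha\tilde{D})^{-1}\right] .
\end{aligned}
\tag{93}
\]

Using the properties of $\Delta$ and $\tilde{\Delta}$, we get that $\tilde{\epsilon} = O(t^{-2})$.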
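Similar calculations allow us to evaluate $\alpha$ and $\tau$, and to obtain

\[
(\alpha-\tau)\,u_0 + (\tilde{\alpha}-\tilde{\tau})\,v_0 = \epsilon
\tag{94}
\]

where

\[
\begin{aligned}
u_0 &= 1 - \frac{1}{t}\mathrm{Tr}\left[D(I+\tilde{\alpha}D)^{-1}B\,E(\tilde{S})\,\tilde{D}\tilde{R}\,B^H(I+\tilde{\alpha}D)^{-1}\right] \\
v_0 &= \sigma^2\,\frac{1}{t}\mathrm{Tr}(D E(S) D R)
\end{aligned}
\tag{95}
\]

and where $\epsilon = O(t^{-2})$. (94, 92) can be written as

\[
\begin{pmatrix} u_0 & v_0 \\ \tilde{u}_0 & \tilde{v}_0 \end{pmatrix}
\begin{pmatrix} \alpha-\tau \\ \tilde{\alpha}-\tilde{\tau} \end{pmatrix}
=
\begin{pmatrix} \epsilon \\ \tilde{\epsilon} \end{pmatrix} .
\tag{96}
\]

If the determinant $u_0\tilde{v}_0 - \tilde{u}_0 v_0$ of the $2 \times 2$ matrix governing the system is nonzero, $\alpha - \tau$ and
$\tilde{\alpha} - \tilde{\tau}$ are given by:

\[
\alpha-\tau = \frac{\tilde{v}_0\,\epsilon - v_0\,\tilde{\epsilon}}{u_0\tilde{v}_0 - \tilde{u}_0 v_0}, \qquad
\tilde{\alpha}-\tilde{\tau} = \frac{u_0\,\tilde{\epsilon} - \tilde{u}_0\,\epsilon}{u_0\tilde{v}_0 - \tilde{u}_0 v_0} .
\tag{97}
\]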
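
The way the bound on the determinant controls the solution of a 2 x 2 system can be illustrated numerically. The sketch below applies Cramer's rule to generic coefficients; the function name `solve_2x2` and all numerical values are illustrative assumptions, not quantities taken from the paper.

```python
def solve_2x2(u0, v0, u0_t, v0_t, eps, eps_t):
    """Solve [[u0, v0], [u0_t, v0_t]] @ [x, x_t] = [eps, eps_t] by Cramer's rule.

    Returns (x, x_t, det). The suffix _t stands for a "tilde" quantity.
    """
    det = u0 * v0_t - u0_t * v0
    if det == 0.0:
        raise ValueError("singular system: determinant is zero")
    x = (v0_t * eps - v0 * eps_t) / det
    x_t = (u0 * eps_t - u0_t * eps) / det
    return x, x_t, det


# Illustrative coefficients: diagonal terms above 1/2, off-diagonal terms below 1/4,
# so the determinant stays above 1/4 - 1/16 = 3/16.
u0, v0, u0_t, v0_t = 0.8, 0.2, 0.1, 0.7
eps, eps_t = 1e-4, -2e-4  # small right-hand sides

x, x_t, det = solve_2x2(u0, v0, u0_t, v0_t, eps, eps_t)

# The solution is at most of the order of the right-hand side divided by the determinant.
bound = (abs(v0_t) * abs(eps) + abs(v0) * abs(eps_t)) / abs(det)
assert abs(x) <= bound + 1e-15
```

With coefficients bounded and the determinant bounded away from zero, the solution inherits the order of magnitude of the right-hand side, which is the mechanism used in the argument.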
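As matrices $R$ and $E(S)$ are less than $\frac{1}{\sigma^2}I_r$ and matrices $\tilde{R}$ and $E(\tilde{S})$ are less than $\frac{1}{\sigma^2}I_t$, it is
easy to check that $u_0, v_0, \tilde{u}_0, \tilde{v}_0$ are uniformly bounded. As $\epsilon$ and $\tilde{\epsilon}$ are $O(t^{-2})$ terms, $(\alpha - \tau)$
and $(\tilde{\alpha} - \tilde{\tau})$ will converge to 0 as long as the inverse $(u_0\tilde{v}_0 - \tilde{u}_0 v_0)^{-1}$ of the determinant is
uniformly bounded. For the moment, we show this property for $\sigma^2$ large enough. For this, we
study the behaviour of coefficients $u_0, v_0, \tilde{u}_0, \tilde{v}_0$ for large enough values of $\sigma^2$. It is easy to
check that:

\[
\begin{aligned}
u_0 &\geq 1 - \frac{r}{t}\,\frac{1}{\sigma^4}\,d_{\max}\,\tilde{d}_{\max}\,b_{\max}^2 \\
\tilde{v}_0 &\geq 1 - \frac{1}{\sigma^4}\,d_{\max}\,\tilde{d}_{\max}\,b_{\max}^2 \\
\tilde{u}_0 &\leq \frac{\tilde{d}_{\max}^{\,2}}{\sigma^2} \\
v_0 &\leq \frac{r}{t}\,\frac{d_{\max}^{\,2}}{\sigma^2} .
\end{aligned}
\tag{98}
\]

As $\frac{t}{r} \to c$, it is clear that there exists $\sigma_0^2$ and an integer $t_0$ for which $u_0 > 1/2$, $\tilde{v}_0 > 1/2$, $\tilde{u}_0 < 1/4$,
$v_0 < 1/4$ for $t \geq t_0$ and $\sigma^2 \geq \sigma_0^2$. Therefore, $u_0\tilde{v}_0 - \tilde{u}_0 v_0 > \frac{3}{16}$ for $t \geq t_0$ and $\sigma^2 \geq \sigma_0^2$. Eq.
(97) thus implies that if $\sigma^2 \geq \sigma_0^2$, then $\alpha - \tau$ and $\tilde{\alpha} - \tilde{\tau}$ are of the same order of magnitude as
$\epsilon$ and $\tilde{\epsilon}$, i.e. $O(t^{-2})$, and therefore converge to 0 when $t \to \infty$. It remains to prove that this convergence
still holds for $0 < \sigma^2 < \sigma_0^2$. For this, we shall rely on Montel's theorem (see e.g. [5]), a tool
frequently used in the context of large random matrices. It is based on the observation that,
considered as functions of parameter $\sigma^2$, $\alpha(\sigma^2) - \tau(\sigma^2)$ and $\tilde{\alpha}(\sigma^2) - \tilde{\tau}(\sigma^2)$ can be extended to
holomorphic functions on $\mathbb{C} - \mathbb{R}^-$ by replacing $\sigma^2$ by a complex number $z$. Moreover, it can be
shown that these holomorphic functions are uniformly bounded on each compact subset $K$ of
$\mathbb{C} - \mathbb{R}^-$, in the sense that $\sup_t \sup_{z \in K} |\alpha(z) - \tau(z)| < \infty$ and $\sup_t \sup_{z \in K} |\tilde{\alpha}(z) - \tilde{\tau}(z)| < \infty$.
Using Montel's theorem, it can thus be shown that if $\alpha(\sigma^2) - \tau(\sigma^2)$ and $\tilde{\alpha}(\sigma^2) - \tilde{\tau}(\sigma^2)$ converge
towards zero for each $\sigma^2 > \sigma_0^2$, then for each $z \in \mathbb{C} - \mathbb{R}^-$, $\alpha(z) - \tau(z)$ and $\tilde{\alpha}(z) - \tilde{\tau}(z)$ converge
as well towards 0. This in particular implies that $\alpha(\sigma^2) - \tau(\sigma^2)$ and $\tilde{\alpha}(\sigma^2) - \tilde{\tau}(\sigma^2)$ converge
towards 0 for each $\sigma^2 > 0$. For more details, the reader may e.g. refer to [17]. This completes
the proof of (69).

We note that Montel's theorem does not guarantee that $\alpha - \tau$ and $\tilde{\alpha} - \tilde{\tau}$ are still $O(t^{-2})$
terms for $\sigma^2 < \sigma_0^2$. Establishing this is one of the purposes of the proof of Step 2 below.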



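In order to finish the proof of Proposition 7, it remains to check that (70) holds. We first
observe that $R - T = R\,(T^{-1} - R^{-1})\,T$. Using the expressions of $R^{-1}$ and $T^{-1}$, multiplying
by $M$, and taking the trace yields:

\[
\frac{1}{t}\mathrm{Tr}\left[M(R-T)\right] = (\tilde{\beta}-\tilde{\alpha})\,\sigma^2\,\frac{1}{t}\mathrm{Tr}(MRDT) + (\alpha-\beta)\,\frac{1}{t}\mathrm{Tr}\left[MRB(I+\alpha\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right] .
\tag{99}
\]

As the terms $\frac{1}{t}\mathrm{Tr}(MRDT)$ and $\frac{1}{t}\mathrm{Tr}\left[MRB(I+\alpha\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right]$ are uniformly
bounded, it is sufficient to establish that $(\alpha - \beta)$ and $(\tilde{\alpha} - \tilde{\beta})$ converge towards 0. For this, we
note that (69) implies that

\[
\alpha = \frac{1}{t}\mathrm{Tr}(DR) + \epsilon', \qquad \tilde{\alpha} = \frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R}) + \tilde{\epsilon}',
\tag{100}
\]

where $\epsilon'$ and $\tilde{\epsilon}'$ converge towards 0. We express $(\alpha - \beta) = \frac{1}{t}\mathrm{Tr}\,D(R - T) + \epsilon'$. Using
$R - T = R\,(T^{-1} - R^{-1})\,T$, multiplying by $D$, and taking the trace yields

\[
(\alpha-\beta)\left(1 - \frac{1}{t}\mathrm{Tr}\left[DRB(I+\alpha\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right]\right) + (\tilde{\alpha}-\tilde{\beta})\,\sigma^2\,\frac{1}{t}\mathrm{Tr}(DRDT) = \epsilon' .
\tag{101}
\]

Similarly, we obtain that

\[
(\alpha-\beta)\,\sigma^2\,\frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{R}\tilde{D}\tilde{T}) + (\tilde{\alpha}-\tilde{\beta})\left(1 - \frac{1}{t}\mathrm{Tr}\left[\tilde{D}\tilde{R}B^H(I+\tilde{\alpha}D)^{-1}D(I+\tilde{\beta}D)^{-1}B\tilde{T}\right]\right) = \tilde{\epsilon}' .
\tag{102}
\]

Equations (101) and (102) can be interpreted as a linear system w.r.t. $(\alpha - \beta)$ and $(\tilde{\alpha} - \tilde{\beta})$.
Using the same approach as in the proof of (69), we prove that $(\alpha - \beta)$ and $(\tilde{\alpha} - \tilde{\beta})$ converge
towards 0. This establishes (70) and completes the proof of Proposition 7. ■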

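2) Second step: $\frac{1}{t}\mathrm{Tr}\,M(E(S) - R)$ and $\frac{1}{t}\mathrm{Tr}\,M(R - T)$ are $O(t^{-2})$ terms: This section is
devoted to the proof of the following proposition.

Proposition 8: For each deterministic $r \times r$ matrix $M$, uniformly bounded (for the spectral
norm) as $t \to \infty$, we have:

\[
\frac{1}{t}\mathrm{Tr}\left[M(E(S) - R)\right] = O(t^{-2})
\tag{103}
\]

\[
\frac{1}{t}\mathrm{Tr}\left[M(R - T)\right] = O(t^{-2})
\tag{104}
\]

Proof: We first establish (103). For this, we prove that the inverse of the determinant
$u_0\tilde{v}_0 - \tilde{u}_0 v_0$ of linear system (96) is uniformly bounded for each $\sigma^2 > 0$. In order to state the
corresponding result, we define $(u, v, \tilde{u}, \tilde{v})$ by

\[
\begin{aligned}
u &= 1 - \frac{1}{t}\mathrm{Tr}\left(DTB(I+\beta\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right) \\
\tilde{v} &= 1 - \frac{1}{t}\mathrm{Tr}\left(\tilde{D}\tilde{T}B^H(I+\tilde{\beta}D)^{-1}D(I+\tilde{\beta}D)^{-1}B\tilde{T}\right) \\
v &= \sigma^2\,\frac{1}{t}\mathrm{Tr}(DTDT) \\
\tilde{u} &= \sigma^2\,\frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{T}\tilde{D}\tilde{T}) .
\end{aligned}
\tag{105}
\]

The expressions of $(u, v, \tilde{u}, \tilde{v})$ nearly coincide with the expressions of coefficients
$(u_0, v_0, \tilde{u}_0, \tilde{v}_0)$, the only difference being that, in the definition of $(u, v, \tilde{u}, \tilde{v})$, matrices $(E(S), R)$
are both replaced by matrix $T$, matrices $(E(\tilde{S}), \tilde{R})$ are both replaced by matrix $\tilde{T}$ and scalars
$(\alpha, \tilde{\alpha})$ are replaced by scalars $(\beta, \tilde{\beta})$. (69) and (70) immediately imply that $(u_0, v_0, \tilde{u}_0, \tilde{v}_0)$ can
be written as

\[
u_0 = u + \epsilon_u, \quad v_0 = v + \epsilon_v, \quad \tilde{u}_0 = \tilde{u} + \epsilon_{\tilde{u}}, \quad \tilde{v}_0 = \tilde{v} + \epsilon_{\tilde{v}},
\tag{106}
\]

where $\epsilon_u, \epsilon_v, \epsilon_{\tilde{u}}, \epsilon_{\tilde{v}}$ converge to 0 when $t \to \infty$. The behaviour of $u\tilde{v} - \tilde{u}v$ is provided in the
following Lemma, whose proof is given in paragraph II-C.3.

Lemma 4: Coefficients $(u, v, \tilde{u}, \tilde{v})$ satisfy: (i) $u = \tilde{v}$, (ii) $0 < u < 1$ and $\inf_t u > 0$, (iii)
$0 < u\tilde{v} - \tilde{u}v < 1$ and $\sup_t \frac{1}{u\tilde{v} - \tilde{u}v} < \infty$.

(106) and Lemma 4 immediately imply that there exists $t_0$ such that $0 < u_0\tilde{v}_0 - \tilde{u}_0 v_0 \leq 1$ for each
$t \geq t_0$ and

\[
\sup_{t \geq t_0} \frac{1}{u_0\tilde{v}_0 - \tilde{u}_0 v_0} < \infty .
\tag{107}
\]

This eventually shows that $\alpha - \tau$ and $\tilde{\alpha} - \tilde{\tau}$ are of the same order of magnitude as $\epsilon$ and $\tilde{\epsilon}$, i.e.
are $O(t^{-2})$ terms.

In order to prove (104), we first remark that, by (103), $\epsilon'$ and $\tilde{\epsilon}'$ defined by (100) are $O(t^{-2})$
terms. It is thus sufficient to establish that the inverse of the determinant of the linear system
associated to equations (101) and (102) is uniformly bounded. Eq. (70) implies that the behaviour
of this determinant is equivalent to that of $u\tilde{v} - \tilde{u}v$. Eq. (104) thus follows from Lemma
4. This completes the proof of Proposition 8. ■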

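3) Proof of Lemma 4: In order to establish item (i), we notice that a direct application of
the matrix inversion Lemma yields:

\[
\tilde{T}B^H(I+\tilde{\beta}D)^{-1} = (I+\beta\tilde{D})^{-1}B^H T .
\tag{108}
\]

The equality $u = \tilde{v}$ immediately follows from (108).

The proofs of (ii) and (iii) are based on the observation that function $\sigma^2 \mapsto \sigma^2\tilde{\beta}(\sigma^2)$ is
increasing while function $\sigma^2 \mapsto \beta(\sigma^2)$ is decreasing. This claim is a consequence of Eq. (16)
that we recall below:

\[
\beta(\sigma^2) = \int_{\mathbb{R}^+} \frac{d\mu_b(\lambda)}{\lambda + \sigma^2}, \qquad \tilde{\beta}(\sigma^2) = \int_{\mathbb{R}^+} \frac{d\tilde{\mu}_b(\lambda)}{\lambda + \sigma^2},
\]

where $\mu_b(\mathbb{R}^+) = \frac{1}{t}\mathrm{Tr}(D)$ and $\tilde{\mu}_b(\mathbb{R}^+) = \frac{1}{t}\mathrm{Tr}(\tilde{D})$. Note that $\beta$ is decreasing because
$\sigma^2 \mapsto \frac{1}{\lambda + \sigma^2}$ is decreasing and $\sigma^2\tilde{\beta}(\sigma^2)$ is increasing because $\sigma^2 \mapsto \frac{\sigma^2}{\lambda + \sigma^2}$ is increasing. Denote by $'$ the
differentiation operator w.r.t. $\sigma^2$. Then, $(\sigma^2\tilde{\beta})' > 0$ and $\beta' < 0$ for each $\sigma^2$. We now differentiate
relations (15) w.r.t. $\sigma^2$. After some algebra, we obtain:

\[
\begin{aligned}
\tilde{v}\,(\sigma^2\tilde{\beta})' + \sigma^2\tilde{u}\,\beta' &= \frac{1}{t}\mathrm{Tr}\left(\tilde{D}\tilde{T}B^H(I+\tilde{\beta}D)^{-1}(I+\tilde{\beta}D)^{-1}B\tilde{T}\right) \\
\frac{v}{\sigma^2}\,(\sigma^2\tilde{\beta})' + u\,\beta' &= -\frac{1}{t}\mathrm{Tr}\,TDT .
\end{aligned}
\tag{109}
\]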
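As $\beta' < 0$, the first equation of (109) implies that $\tilde{v}\,(\sigma^2\tilde{\beta})' > 0$. As $(\sigma^2\tilde{\beta})' > 0$, this yields
$\tilde{v} > 0$. As $\tilde{v} < 1$ clearly holds, the first part of (ii) is proved.

We now prove that $\inf_t \tilde{v} > 0$. The first equation of (109) yields:

\[
\tilde{v} \geq \frac{1}{(\sigma^2\tilde{\beta})'}\,\sigma^2\,\tilde{u}\,|\beta'| .
\tag{110}
\]

In the following, we show that $\inf_t \frac{1}{(\sigma^2\tilde{\beta})'} > 0$, $\inf_t |\beta'| > 0$ and that $\inf_t \tilde{u} > 0$.
By representation (16),

\[
-\beta' = \int_{\mathbb{R}^+} \frac{d\mu_b(\lambda)}{(\lambda + \sigma^2)^2} \quad \text{and} \quad (\sigma^2\tilde{\beta}(\sigma^2))' = \int_{\mathbb{R}^+} \frac{\lambda\, d\tilde{\mu}_b(\lambda)}{(\lambda + \sigma^2)^2} .
\]

As $\frac{\lambda}{(\lambda + \sigma^2)^2} \leq \frac{1}{\sigma^2}$ for $\lambda > 0$, $(\sigma^2\tilde{\beta})' \leq \frac{1}{\sigma^2}\tilde{\mu}_b(\mathbb{R}^+) = \frac{1}{\sigma^2}\frac{1}{t}\mathrm{Tr}\,\tilde{D}$. Therefore, the term $\frac{1}{(\sigma^2\tilde{\beta})'}$ is
lower-bounded by $\sigma^2\left(\frac{1}{t}\mathrm{Tr}\,\tilde{D}\right)^{-1}$. As $\frac{1}{t}\mathrm{Tr}\,\tilde{D} \leq \tilde{d}_{\max}$, we have $\inf_t \frac{1}{(\sigma^2\tilde{\beta})'} > 0$.

We now establish that $\inf_t |\beta'| > 0$. We first use Jensen's inequality: as measure
$\left(\frac{1}{t}\mathrm{Tr}\,D\right)^{-1} d\mu_b(\lambda)$ is a probability distribution:

\[
\left[\frac{1}{\frac{1}{t}\mathrm{Tr}\,D}\int_{\mathbb{R}^+}\frac{d\mu_b(\lambda)}{\lambda + \sigma^2}\right]^2 \leq \frac{1}{\frac{1}{t}\mathrm{Tr}\,D}\int_{\mathbb{R}^+}\frac{d\mu_b(\lambda)}{(\lambda + \sigma^2)^2} .
\]

In other words, $|\beta'| = \int_{\mathbb{R}^+} \frac{d\mu_b(\lambda)}{(\lambda + \sigma^2)^2}$ satisfies

\[
|\beta'| \geq \frac{1}{\frac{1}{t}\mathrm{Tr}\,D}\left[\int_{\mathbb{R}^+}\frac{d\mu_b(\lambda)}{\lambda + \sigma^2}\right]^2 = \frac{1}{\frac{1}{t}\mathrm{Tr}\,D}\,\beta^2 .
\]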

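As mentioned above, $\left(\frac{1}{t}\mathrm{Tr}\,D\right)^{-1}$ is lower-bounded by $\left(\frac{r}{t}\,d_{\max}\right)^{-1}$. Therefore, it remains to
establish that $\inf_t \beta^2 > 0$, or equivalently that $\inf_t \beta > 0$. For this, we assume that $\inf_t \beta_t(\sigma^2) = 0$
(we indicate that $\beta$ depends both on $\sigma^2$ and $t$). Therefore, there exists an increasing sequence
of integers $(t_k)_{k \geq 0}$ for which $\lim_{k \to \infty} \beta_{t_k}(\sigma^2) = 0$, i.e.

\[
\lim_{k \to \infty} \int_{\mathbb{R}^+} \frac{d\mu_b^{(t_k)}(\lambda)}{\lambda + \sigma^2} = 0,
\]

where $\mu_b^{(t_k)}$ is the positive measure associated with $\beta_{t_k}(\sigma^2)$. As $D$ is uniformly bounded, the
sequence $(\mu_b^{(t_k)})_{k \geq 0}$ is tight. One can therefore extract from $(\mu_b^{(t_k)})_{k \geq 0}$ a subsequence $(\mu_b^{(t_{k_l})})_{l \geq 0}$
that converges weakly to a certain measure $\mu_b^*$ which of course satisfies

\[
\int_{\mathbb{R}^+} \frac{d\mu_b^*(\lambda)}{\lambda + \sigma^2} = 0 .
\]

This implies that $\mu_b^* = 0$, and thus $\mu_b^*(\mathbb{R}^+) = 0$, while the convergence of $(\mu_b^{(t_{k_l})})_{l \geq 0}$ gives

\[
\mu_b^*(\mathbb{R}^+) = \lim_{l \to \infty} \mu_b^{(t_{k_l})}(\mathbb{R}^+) = \lim_{l \to \infty} \frac{1}{t_{k_l}}\mathrm{Tr}\,D_{t_{k_l}} > 0
\]

by assumption (3). Therefore, the assumption $\inf_t \beta_t(\sigma^2) = 0$ leads to a contradiction. Thus,
$\inf_t \beta > 0$ and $\inf_t |\beta'| > 0$ is proved.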

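We finally establish that $\tilde{u}$ is lower-bounded, i.e. that $\inf_t \frac{1}{t}\mathrm{Tr}\,\tilde{D}\tilde{T}\tilde{D}\tilde{T} > 0$. For any Hermitian
positive matrix $M$,

\[
\frac{1}{t}\mathrm{Tr}(M^2) \geq \left[\frac{1}{t}\mathrm{Tr}(M)\right]^2 .
\]

We use this inequality for $M = \tilde{T}^{1/2}\tilde{D}\tilde{T}^{1/2}$. This leads to

\[
\frac{1}{t}\mathrm{Tr}\,\tilde{D}\tilde{T}\tilde{D}\tilde{T} = \frac{1}{t}\mathrm{Tr}\,M^2 \geq \left[\frac{1}{t}\mathrm{Tr}(M)\right]^2 = \left[\frac{1}{t}\mathrm{Tr}(\tilde{D}\tilde{T})\right]^2 = \tilde{\beta}^2 .
\]

Therefore, $\inf_t \frac{1}{t}\mathrm{Tr}\,\tilde{D}\tilde{T}\tilde{D}\tilde{T} \geq \inf_t \tilde{\beta}^2$. Using the same approach as above, we can prove that
$\inf_t \tilde{\beta}^2 > 0$. Proof of (ii) is completed.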
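
The trace inequality $\frac{1}{t}\mathrm{Tr}(M^2) \geq [\frac{1}{t}\mathrm{Tr}(M)]^2$ used in this step can be checked numerically. The following is a minimal sketch on a randomly generated real symmetric positive semidefinite matrix; the helper names and the test matrix are assumptions made for illustration and do not come from the paper.

```python
import random


def matmul(a, b):
    """Plain-Python product of two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def normalized_trace_gap(m):
    """Return (1/t) Tr(M^2) - ((1/t) Tr(M))^2 for a square matrix M of size t."""
    t = len(m)
    return trace(matmul(m, m)) / t - (trace(m) / t) ** 2


random.seed(0)
t = 6
g = [[random.gauss(0.0, 1.0) for _ in range(t)] for _ in range(t)]
m = matmul(g, transpose(g))  # G G^T is symmetric positive semidefinite

# The gap equals the variance of the eigenvalues of M, hence it is nonnegative.
gap = normalized_trace_gap(m)
assert gap >= 0.0
```

The gap vanishes exactly when all eigenvalues coincide, for instance for a multiple of the identity.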



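In order to establish (iii), we use the first equation of (109) to express $(\sigma^2\tilde{\beta})'$ in terms of $\beta'$,
and plug this relation into the second equation of (109). This gives:

\[
\left(u - \frac{1}{\tilde{v}}\,\tilde{u}v\right)\beta' = -\frac{1}{t}\mathrm{Tr}\,TDT - \frac{v}{\sigma^2\tilde{v}}\,\frac{1}{t}\mathrm{Tr}\left(\tilde{D}\tilde{T}B^H(I+\tilde{\beta}D)^{-1}(I+\tilde{\beta}D)^{-1}B\tilde{T}\right) .
\tag{111}
\]

The right-hand side of (111) is negative as well as $\beta'$. Therefore, $u - \frac{1}{\tilde{v}}\tilde{u}v > 0$. As $\tilde{v}$ is positive,
$u\tilde{v} - \tilde{u}v$ is also positive. Moreover, $u$ and $\tilde{v}$ are strictly less than 1. As $\tilde{u}$ and $v$ are both strictly
positive, $u\tilde{v} - \tilde{u}v$ is strictly less than 1. To complete the proof of (iii), we notice that by (111),

\[
\frac{1}{u\tilde{v} - \tilde{u}v} \leq \frac{|\beta'|}{\tilde{v}\,\frac{1}{t}\mathrm{Tr}\,TDT} .
\]

$|\beta'|$ clearly satisfies $|\beta'| \leq \frac{1}{\sigma^4}\frac{1}{t}\mathrm{Tr}\,D$ and is thus upper bounded; (ii) implies that
$\sup_t \frac{1}{\tilde{v}} < +\infty$. It remains to verify that $\inf_t \frac{1}{t}\mathrm{Tr}\,TDT > 0$. Denote by $x = \frac{1}{t}\mathrm{Tr}\,TDT$:

\[
x = \frac{1}{t}\sum_{i=1}^{r}\sum_{j=1}^{r} d_i\,|T_{ij}|^2 .
\]

In order to use Jensen's inequality, we consider $\kappa_i = \frac{d_i}{\frac{1}{t}\mathrm{Tr}\,D}$ and notice that $\frac{1}{t}\sum_{i=1}^{r}\kappa_i = 1$. $x$
can be written as

\[
x = \left(\frac{1}{t}\mathrm{Tr}\,D\right)\frac{1}{t}\sum_{i=1}^{r}\kappa_i\left(\sum_{j=1}^{r}|T_{ij}|^2\right) .
\]

By Jensen's inequality

\[
\frac{1}{t}\sum_{i=1}^{r}\kappa_i\left(\sum_{j=1}^{r}|T_{ij}|^2\right) \geq \left[\frac{1}{t}\sum_{i=1}^{r}\kappa_i\left(\sum_{j=1}^{r}|T_{ij}|^2\right)^{1/2}\right]^2 .
\]

Moreover,

\[
\left[\frac{1}{t}\sum_{i=1}^{r}\kappa_i\left(\sum_{j=1}^{r}|T_{ij}|^2\right)^{1/2}\right]^2 \geq \left[\frac{1}{t}\sum_{i=1}^{r}\kappa_i\,T_{ii}\right]^2 = \left(\frac{1}{t}\mathrm{Tr}\,D\right)^{-2}\beta^2 .
\]

Finally,

\[
\frac{1}{t}\mathrm{Tr}\,TDT \geq \left(\frac{1}{t}\mathrm{Tr}\,D\right)^{-1}\beta^2 .
\]

Since $\inf_t \beta^2 > 0$, we have $\inf_t \frac{1}{t}\mathrm{Tr}\,TDT > 0$ and the proof of (iii) is completed.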
Appendix III
Strict concavity of $\bar{I}(Q)$: Remaining proofs

A. Proof of Lemma 1

Remark that $\phi_m$ is strictly concave due to (45). Remark also that $\phi$ is concave as a pointwise
limit of the $\phi_m$'s. Now in order to prove the strict concavity of $\phi$, assume that there exists a
subinterval, say $(a, b) \subset [0, 1]$ with $a < b$ where $\phi$ fails to be strictly concave:

\[
\forall \lambda \in [0, 1], \quad \phi(\lambda a + (1 - \lambda)b) = \lambda\phi(a) + (1 - \lambda)\phi(b) .
\]

Otherwise stated,

\[
\forall x \in (a, b), \quad \phi(x) = \frac{\phi(b) - \phi(a)}{b - a}\,(x - a) + \phi(a) .
\]

Let $x \in (a, b)$ and $h > 0$ be small enough so that $x - h$ and $x + h$ belong to $(a, b)$; recall the
following inequality, valid for differentiable concave functions:

\[
\frac{\phi_m(x) - \phi_m(x - h)}{h} \geq \phi_m'(x) \geq \frac{\phi_m(x + h) - \phi_m(x)}{h} .
\]
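
The slope inequality for differentiable concave functions can be illustrated numerically. The sketch below uses $\log(1+x)$ as a stand-in concave function; this choice, and the names `slope_sandwich`, `phi` and `dphi`, are assumptions made for illustration and are not the functions $\phi_m$ of the paper.

```python
import math


def slope_sandwich(phi, dphi, x, h):
    """Return (left slope, derivative, right slope) of phi at x for step h > 0."""
    left = (phi(x) - phi(x - h)) / h
    right = (phi(x + h) - phi(x)) / h
    return left, dphi(x), right


def phi(x):
    # Illustrative strictly concave function on [0, 1].
    return math.log(1.0 + x)


def dphi(x):
    # Exact derivative of phi.
    return 1.0 / (1.0 + x)


# For a concave function, the backward slope dominates the derivative,
# which in turn dominates the forward slope.
for x in (0.2, 0.5, 0.8):
    for h in (0.1, 0.01):
        left, mid, right = slope_sandwich(phi, dphi, x, h)
        assert left >= mid >= right
```

For an affine function the three quantities coincide, which is exactly the degenerate situation that the contradiction argument rules out.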

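Letting $m \to \infty$, we obtain:

\[
\frac{\phi(b) - \phi(a)}{b - a} \geq \limsup_{m \to \infty} \phi_m'(x) \geq \liminf_{m \to \infty} \phi_m'(x) \geq \frac{\phi(b) - \phi(a)}{b - a} .
\]

In particular, for all $x \in (a, b)$, $\lim_{m \to \infty} \phi_m'(x) = \frac{\phi(b) - \phi(a)}{b - a}$. Now let $[x, x + h] \subset (a, b)$. Fatou's
lemma together with (45) yield:

\[
0 < -\kappa h \leq \int_x^{x+h} \liminf_{m \to \infty}\left(-\phi_m''(u)\right) du \leq \liminf_{m \to \infty} \int_x^{x+h} \left(-\phi_m''(u)\right) du = \lim_{m \to \infty}\left(\phi_m'(x) - \phi_m'(x + h)\right) = 0 .
\]

This yields a contradiction, therefore $\phi$ must be strictly concave on $[0, 1]$.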

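B. Proof of (46).

We define $M$ as the $tm \times tm$ matrix given by

\[
M = H^H\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2} .
\]

We have:

\[
\phi_m''(\lambda) = -\frac{1}{m}\,E\,\mathrm{Tr}\left[M(Q_1 - Q_2)M(Q_1 - Q_2)\right]
\]

or equivalently

\[
\phi_m''(\lambda) = -\frac{1}{m}\,E\,\mathrm{Tr}\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2}(Q_1 - Q_2)M(Q_1 - Q_2)H^H\right] .
\]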
I + ) ^(Qi - Q2)M(Qi - Q2)H^ 
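Recall that $\mathrm{Tr}(AB) \geq \lambda_{\min}(A)\mathrm{Tr}(B)$ for $A$, $B$ Hermitian and nonnegative matrices. In
particular:

\[
\mathrm{Tr}\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2}(Q_1 - Q_2)M(Q_1 - Q_2)H^H\right] \geq \lambda_{\min}\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\right]\mathrm{Tr}\left[\frac{H}{\sigma^2}(Q_1 - Q_2)M(Q_1 - Q_2)H^H\right] .
\]

Similarly, we obtain that

\[
\mathrm{Tr}\left[\frac{H}{\sigma^2}(Q_1 - Q_2)M(Q_1 - Q_2)H^H\right] \geq \lambda_{\min}\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\right]\mathrm{Tr}\left[\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\right] .
\]

This eventually implies that

\[
\mathrm{Tr}\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\frac{H}{\sigma^2}(Q_1 - Q_2)M(Q_1 - Q_2)H^H\right] \geq \lambda_{\min}^2\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\right]\mathrm{Tr}\left[\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\right] .
\]

As

\[
\lambda_{\min}^2\left[\left(I + \frac{HQH^H}{\sigma^2}\right)^{-1}\right] = \frac{1}{\left\|I + \frac{HQH^H}{\sigma^2}\right\|^2} \geq \frac{1}{\left(1 + \sigma^{-2}\|Q\|\,\|H^H H\|\right)^2},
\]

we have:

\[
\phi_m''(\lambda) \leq -E\left[\frac{1}{\left(1 + \sigma^{-2}\|Q\|\,\|H^H H\|\right)^2} \times \frac{1}{m}\mathrm{Tr}\left(\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\right)\right] .
\]

Let us introduce the following notations

\[
\alpha_m = \frac{1}{\left(1 + \sigma^{-2}\|Q\|\,\|H^H H\|\right)^2}, \qquad \beta_m = \frac{1}{m}\mathrm{Tr}\left[\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\right] .
\]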
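The following properties, whose proofs are postponed to Appendix III-C, hold true:

Proposition 9: (i) $\lim_{m \to \infty} \mathrm{var}(\beta_m) = 0$,

(ii) For all $m \geq 1$, $E(\beta_m) = E(\beta_1) = E\,\mathrm{Tr}\left[\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\frac{H^H H}{\sigma^2}(Q_1 - Q_2)\right] > 0$,

(iii) There exists $\delta > 0$ such that for all $\lambda \in [0, 1]$, $\liminf_{m \to \infty} E(\alpha_m) \geq \delta > 0$.

We are now in position to establish (46). By Proposition 9-(i), we have

\[
\left|E(\alpha_m\beta_m) - E(\alpha_m)E(\beta_m)\right| \leq \sqrt{\mathrm{var}(\beta_m)}\sqrt{E(\alpha_m^2)} \leq \sqrt{\mathrm{var}(\beta_m)} \xrightarrow[m \to \infty]{} 0 .
\]

By Proposition 9-(ii),(iii), we have:

\[
\liminf_{m \to \infty} E(\alpha_m\beta_m) = \liminf_{m \to \infty} E(\alpha_m)E(\beta_m) = E(\beta_1)\liminf_{m \to \infty} E(\alpha_m) \geq \delta E(\beta_1) > 0 .
\]

The bound (46) is now established for $\kappa = -\delta E(\beta_1)$. Applying Lemma 1 to $\phi_m(\lambda)$, we conclude
that $\lambda \mapsto \phi(\lambda)$ is strictly concave for every $Q_1$, $Q_2$ in $\mathcal{C}_1$ ($Q_1 \neq Q_2$), and so is $Q \mapsto \bar{I}(Q)$ by
Proposition 2.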
C. Proof of Proposition 9

Proof: [Proof of (i)] In order to prove that $\lim_m \mathrm{var}(\beta_m) = 0$, we shall rely on the Poincaré-Nash
inequality. We shall use the following decomposition (see footnote 3):



~ 1 ~ ~ 1 



UDtU^. 



In particular, H writes 



U^HU 



K 



U^AU + D 



1 U^WU ~ 1 



K + 1 " ' 
= B + D^^Di = B + Y 

A 



where $X$ is a $r \times t$ matrix with i.i.d. $\mathcal{CN}(0, 1)$ entries. Consider now the following matrices:

B = B, r = D, t = lm®t>, V = I„ U, V = ® tJ. 
Similarly, H writes : 



V^HV = B + r2^Lf 2 = B + Y 



A 



where X is a mr x mt matrix with i.i.d. CA^(0, 1) entries. Denote by = U^(Qi — 
Q2)U and by = V^(Qi - Q2)V(= (g) 0). The quantity writes then : /3m = 
^Tr0E^S0E^S. Considering f3m as a function of the entries of X = (Xij), i.e. 
= Standard computations yield 



Poincare-Nash inequality yields then 



m 



var 



mt ^-^ 

mt ^ — ' 



50(X) 



dXi. 



< 



< 



mt 

4c?rna,xC?! 



E 



max "max 



m 



3*3 



m'^t'^ 

Moreover, Schwarz inequality yields 



4dn,axdma. ||0^^0||E ( — TVS^SOE^SO^S^S 



mt 



1 

mt 



1 

mt 



1/2 



mt 



1/2 



3. Note that the notations introduced hereafter slightly differ from those introduced in Section III-B but this 
should not disturb the reader. 
so that



—Trt^tG^t^te < 110^01 

mt 



mt ^ ^ 



H V2 r 



mt ^ ^ 



1/2 



Schwarz inequality yields then 



—Tit^te^t^-Ee] < ||0^0| 

mt ) 



— Tr(S^S)2 

mt 



.1/2 



mt 



.1/2 



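It is tedious, but straightforward, to check that

\[
\sup_m E\left(\frac{1}{mt}\mathrm{Tr}(\Sigma^H\Sigma)^2\right) < +\infty
\]

and

\[
\sup_m E\left(\frac{1}{mt}\mathrm{Tr}(\Sigma^H\Sigma)^4\right) < +\infty
\]

which, in turn, imply that $\mathrm{var}(\beta_m) = O\left(\frac{1}{m^2}\right)$.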
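Proof: [Proof of (ii)] Write $E\beta_m$ as

\[
\begin{aligned}
E\beta_m &= \frac{1}{\sigma^4 m}\,E\,\mathrm{Tr}\left[(B^H B + B^H Y + Y^H B + Y^H Y)\,\Theta\,(B^H B + B^H Y + Y^H B + Y^H Y)\,\Theta\right] \\
&\overset{(a)}{=} \frac{1}{\sigma^4 m}\mathrm{Tr}\,B^H B\Theta B^H B\Theta + \frac{1}{\sigma^4 m}E\,\mathrm{Tr}\,B^H B\Theta Y^H Y\Theta + \frac{1}{\sigma^4 m}E\,\mathrm{Tr}\,B^H Y\Theta Y^H B\Theta \\
&\quad + \frac{1}{\sigma^4 m}E\,\mathrm{Tr}\,Y^H B\Theta B^H Y\Theta + \frac{1}{\sigma^4 m}E\,\mathrm{Tr}\,Y^H Y\Theta B^H B\Theta + \frac{1}{\sigma^4 m}E\,\mathrm{Tr}\,Y^H Y\Theta Y^H Y\Theta
\end{aligned}
\]

where (a) follows from the fact that the terms where $Y$ appears one or three times are readily
zero, and so are the terms like $E\,\mathrm{Tr}\,B^H Y\Theta B^H Y\Theta$. Therefore, it remains to compute the
following four terms:

\[
\begin{aligned}
T_1 &\triangleq \frac{1}{m}\mathrm{Tr}\,B^H B\Theta B^H B\Theta, &
T_2 &\triangleq \frac{1}{m}E\,\mathrm{Tr}\,B^H B\Theta Y^H Y\Theta, \\
T_3 &\triangleq \frac{1}{m}E\,\mathrm{Tr}\,B^H Y\Theta Y^H B\Theta, &
T_4 &\triangleq \frac{1}{m}E\,\mathrm{Tr}\,Y^H Y\Theta Y^H Y\Theta .
\end{aligned}
\]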



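Due to the block nature of the matrices involved, $T_1 = \mathrm{Tr}\,B^H B\Theta B^H B\Theta$; in particular, $T_1$
does not depend on $m$. Let us now compute $T_2$. We have $T_2 = m^{-1}\mathrm{Tr}\left[B^H B\Theta\,E(Y^H Y)\,\Theta\right]$
and $E(Y^H Y) = (mt)^{-1}\tilde{\Gamma}^{1/2}E(X^H\Gamma X)\tilde{\Gamma}^{1/2} = (mt)^{-1}\mathrm{Tr}(\Gamma)\,\tilde{\Gamma}$. Therefore, $T_2$ writes:

\[
T_2 = \frac{1}{mt}\mathrm{Tr}(\Gamma)\,\frac{1}{m}\mathrm{Tr}\left(B^H B\Theta\tilde{\Gamma}\Theta\right) = \frac{1}{t}\mathrm{Tr}(D)\,\mathrm{Tr}\left(B^H B\Theta\tilde{D}\Theta\right),
\]

and this quantity does not depend on $m$. We now turn to the term $T_3$. We have
$T_3 = m^{-1}\mathrm{Tr}\left[B^H E(Y\Theta Y^H)\,B\Theta\right]$. The same computations as before yield $E(Y\Theta Y^H) =
(mt)^{-1}\mathrm{Tr}(\tilde{\Gamma}^{1/2}\Theta\tilde{\Gamma}^{1/2})\,\Gamma$. Therefore $T_3$ writes:

\[
T_3 = \frac{1}{mt}\mathrm{Tr}\left(\tilde{\Gamma}^{1/2}\Theta\tilde{\Gamma}^{1/2}\right)\frac{1}{m}\mathrm{Tr}\left(B^H\Gamma B\Theta\right) = \frac{1}{t}\mathrm{Tr}\left(\tilde{D}^{1/2}\Theta\tilde{D}^{1/2}\right)\mathrm{Tr}\left(B^H D B\Theta\right),
\]

which does not depend on $m$. It remains to compute $T_4 = \frac{1}{m}\mathrm{Tr}\left[E(Y^H Y\Theta Y^H Y)\,\Theta\right]$, where

\[
E(Y^H Y\Theta Y^H Y) = \frac{1}{(mt)^2}\,\tilde{\Gamma}^{1/2}E\left(X^H\Gamma X\tilde{\Gamma}^{1/2}\Theta\tilde{\Gamma}^{1/2}X^H\Gamma X\right)\tilde{\Gamma}^{1/2} .
\]

Computing the individual terms of matrix $E\left(X^H\Gamma X\tilde{\Gamma}^{1/2}\Theta\tilde{\Gamma}^{1/2}X^H\Gamma X\right)$ yields (denote by
$G = \tilde{\Gamma}^{1/2}\Theta\tilde{\Gamma}^{1/2}$ for the sake of simplicity):

\[
\begin{aligned}
\left[E\left(X^H\Gamma X G X^H\Gamma X\right)\right]_{k\ell} &= \sum_{i_1, j_1, j_2, i_2} E\left(\overline{X_{i_1 k}}\,X_{i_1 j_1}\,\overline{X_{i_2 j_2}}\,X_{i_2 \ell}\right)\Gamma_{i_1 i_1}G_{j_1 j_2}\Gamma_{i_2 i_2} \\
&= (\mathrm{Tr}\,\Gamma)^2\,G_{k\ell} + \mathrm{Tr}(\Gamma^2)\,\mathrm{Tr}\,G\;\delta_{k\ell},
\end{aligned}
\]

where $\delta_{k\ell}$ stands for the Kronecker symbol (i.e. $\delta_{k\ell} = 1$ if $k = \ell$, and 0 otherwise). This yields

\[
E(Y^H Y\Theta Y^H Y) = \frac{1}{(mt)^2}(\mathrm{Tr}\,\Gamma)^2\,\tilde{\Gamma}\Theta\tilde{\Gamma} + \frac{1}{(mt)^2}\mathrm{Tr}(\Gamma^2)\,\mathrm{Tr}(\Theta\tilde{\Gamma})\,\tilde{\Gamma}
\]

and

\[
\begin{aligned}
T_4 &= \frac{1}{m}\left(\frac{\mathrm{Tr}\,\Gamma}{mt}\right)^2\mathrm{Tr}(\tilde{\Gamma}\Theta\tilde{\Gamma}\Theta) + \frac{1}{m}\frac{1}{(mt)^2}\mathrm{Tr}(\Gamma^2)\left(\mathrm{Tr}\,\Theta\tilde{\Gamma}\right)^2 \\
&= \frac{1}{t^2}(\mathrm{Tr}\,D)^2\,\mathrm{Tr}(\tilde{D}\Theta\tilde{D}\Theta) + \frac{1}{t^2}\mathrm{Tr}(D^2)\left(\mathrm{Tr}\,\Theta\tilde{D}\right)^2,
\end{aligned}
\]

which does not depend on $m$. This shows that $E\beta_m$ does not depend on $m$, and thus coincides
with $E\beta_1$. In order to complete the proof of (ii), it remains to verify that $E\beta_1 > 0$, or equivalently
that $E\beta_1$ is not equal to 0. If $E\beta_1$ was indeed equal to 0, then matrix

\[
H^H H(Q_1 - Q_2)
\]

would be equal to zero almost everywhere. As $Q_1 \neq Q_2$, there would exist a deterministic nonzero
vector $x$ such that $x^H H^H H x = 0$ almost everywhere, i.e. $Hx = 0$, or equivalently

\[
W\tilde{C}^{1/2}x = -\sqrt{Kt}\;C^{-1/2}Ax .
\tag{112}
\]

As matrix $\tilde{C}^{1/2}$ is positive definite, vector $\tilde{C}^{1/2}x$ is nonzero. Relation (112) leads to a
contradiction because the joint distribution of the entries of $W$ is absolutely continuous. This
shows that $E\beta_1 > 0$. The proof of (ii) is complete. ■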
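Proof: [Proof of (iii)] In order to control $\alpha_m = \left(1 + \sigma^{-2}\|Q\|\,\|H^H H\|\right)^{-2}$, first notice that the
spectral norm of the block matrix built from $Q$ coincides with $\|Q\|$. Now $\|H^H H\| = \|H\|^2$ and

\[
\|H\| \leq \sqrt{\frac{K}{K+1}}\,\|A\| + \frac{1}{\sqrt{K+1}}\,\|C^{1/2}\|\,\left\|\frac{1}{\sqrt{mt}}W\right\|\,\|\tilde{C}^{1/2}\| .
\]

Now the spectral norms of the block matrices built from $A$, $C^{1/2}$ and $\tilde{C}^{1/2}$ coincide with $\|A\|$,
$\|C^{1/2}\|$ and $\|\tilde{C}^{1/2}\|$ respectively. The behaviour of the spectral norm of $(mt)^{-1/2}W$ is well-known
(see for instance [36], [1]):

\[
\left\|\frac{1}{\sqrt{mt}}W\right\| \xrightarrow[m \to \infty]{} 1 + \sqrt{1/c} \quad \text{almost surely.}
\]

Therefore, Fatou's lemma yields the desired result: $\liminf_m E\alpha_m \geq \delta > 0$, and (iii) is
proved. ■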



Appendix IV 
Proof of Proposition 5, item (i). 

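By (50) and (51), $(\kappa, \tilde{\kappa}, Q) \mapsto V(\kappa, \tilde{\kappa}, Q)$ is differentiable from $\mathbb{R}^+ \times \mathbb{R}^+ \times \mathcal{C}_1$ to $\mathbb{R}$. In
order to prove that $\bar{I}(Q) = V(\delta(Q), \tilde{\delta}(Q), Q)$ is differentiable, it is sufficient to prove the
differentiability of $\delta, \tilde{\delta} : \mathcal{C}_1 \to \mathbb{R}$. Recall that $\delta$ and $\tilde{\delta}$ are solution of system (33) associated
with matrix $Q$. In order to apply the implicit function theorem, which will immediately yield
the differentiability of $\delta$ and $\tilde{\delta}$ with respect to $Q$, we must check that:

1) The function

\[
(\delta, \tilde{\delta}, Q) \mapsto \Upsilon(\delta, \tilde{\delta}, Q) = \begin{pmatrix} \delta - f(\delta, \tilde{\delta}, Q) \\ \tilde{\delta} - \tilde{f}(\delta, \tilde{\delta}, Q) \end{pmatrix}
\]

is differentiable.

2) The partial jacobian

\[
D_{(\delta, \tilde{\delta})}\Upsilon(\delta, \tilde{\delta}, Q) = \begin{pmatrix} 1 - \frac{\partial f}{\partial \delta}(\delta, \tilde{\delta}, Q) & -\frac{\partial f}{\partial \tilde{\delta}}(\delta, \tilde{\delta}, Q) \\ -\frac{\partial \tilde{f}}{\partial \delta}(\delta, \tilde{\delta}, Q) & 1 - \frac{\partial \tilde{f}}{\partial \tilde{\delta}}(\delta, \tilde{\delta}, Q) \end{pmatrix}
\]

is invertible for every $Q \in \mathcal{C}_1$.

In order to check the differentiability of $\Upsilon$, recall the following matrix equality

\[
(I + UV)^{-1}U = U(I + VU)^{-1}
\tag{113}
\]

which follows from elementary matrix manipulations (cf. [20, Section 0.7.4]). Applying this
equality to $U = Q^{1/2}$ and $V = \frac{\delta}{K+1}\tilde{C}Q^{1/2}$, we obtain:

\[
AQ^{1/2}\left(I + \frac{\delta}{K+1}Q^{1/2}\tilde{C}Q^{1/2}\right)^{-1}Q^{1/2}A^H = AQ\left(I + \frac{\delta}{K+1}\tilde{C}Q\right)^{-1}A^H
\]

which yields

\[
f(\delta, \tilde{\delta}, Q) = \frac{1}{t}\mathrm{Tr}\left\{C\left[\sigma^2\left(I_r + \frac{\tilde{\delta}}{K+1}C\right) + \frac{K}{K+1}AQ\left(I_t + \frac{\delta}{K+1}\tilde{C}Q\right)^{-1}A^H\right]^{-1}\right\} .
\]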
Clearly, $f$ is differentiable with respect to the three variables $\delta$, $\tilde{\delta}$ and $Q$. Similar computations
yield an analogous expression for $\tilde{f}$, and the same conclusion holds for $\tilde{f}$. Therefore,
$(\delta, \tilde{\delta}, Q) \mapsto \Upsilon(\delta, \tilde{\delta}, Q)$ is differentiable and 1) is proved.
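
The matrix equality (113), $(I + UV)^{-1}U = U(I + VU)^{-1}$, can be checked numerically on small matrices. The sketch below works with real 2 x 2 matrices in plain Python; the helper names and the example matrices are assumptions chosen for illustration only.

```python
def mat2_mul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]


def mat2_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat2_inv(a):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def push_through_gap(u, v):
    """Largest entrywise gap between (I + UV)^{-1} U and U (I + VU)^{-1}."""
    lhs = mat2_mul(mat2_inv(mat2_add(IDENTITY, mat2_mul(u, v))), u)
    rhs = mat2_mul(u, mat2_inv(mat2_add(IDENTITY, mat2_mul(v, u))))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


u = [[1.0, 2.0], [0.0, 1.0]]
v = [[0.5, 0.0], [1.0, 0.25]]
assert push_through_gap(u, v) < 1e-12
```

The identity follows from $U(I + VU) = (I + UV)U$, so the check only requires both $I + UV$ and $I + VU$ to be invertible.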

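In order to study the jacobian $D_{(\delta, \tilde{\delta})}\Upsilon$, let us compute first

\[
\frac{\partial f}{\partial \delta}(\delta, \tilde{\delta}, Q) \overset{(a)}{=} \frac{1}{t}\mathrm{Tr}\left(DTB(I+\beta\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right)
\]

where (a) follows from the virtual channel equivalences (31), (32) together with (39) and (41).
Finally, we end up with the following:

\[
1 - \frac{\partial f}{\partial \delta}(\delta, \tilde{\delta}, Q) = 1 - \frac{1}{t}\mathrm{Tr}\left(DTB(I+\beta\tilde{D})^{-1}\tilde{D}(I+\beta\tilde{D})^{-1}B^H T\right) .
\]

Similar computations yield

\[
\begin{aligned}
1 - \frac{\partial \tilde{f}}{\partial \tilde{\delta}}(\delta, \tilde{\delta}, Q) &= 1 - \frac{1}{t}\mathrm{Tr}\left(\tilde{D}\tilde{T}B^H(I+\tilde{\beta}D)^{-1}D(I+\tilde{\beta}D)^{-1}B\tilde{T}\right), \\
-\frac{\partial f}{\partial \tilde{\delta}}(\delta, \tilde{\delta}, Q) &= \frac{\sigma^2}{t}\mathrm{Tr}(DTDT), \\
-\frac{\partial \tilde{f}}{\partial \delta}(\delta, \tilde{\delta}, Q) &= \frac{\sigma^2}{t}\mathrm{Tr}(\tilde{D}\tilde{T}\tilde{D}\tilde{T}) .
\end{aligned}
\]

The invertibility of the jacobian $D_{(\delta, \tilde{\delta})}\Upsilon$ follows then from Lemma 4 in Appendix II-C and 2) is
proved. In particular, we can assert that $\mathcal{C}_1 \ni Q \mapsto \delta(Q)$ and $\mathcal{C}_1 \ni Q \mapsto \tilde{\delta}(Q)$ are differentiable
due to the Implicit function theorem. Item (i) is proved.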



Appendix V 
Proof of Proposition 6 

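First note that the sequence $(Q_k)$ belongs to the compact set $\mathcal{C}_1$. Therefore, in order to
show that the sequence converges, it is sufficient to establish that the limits of all convergent
subsequences coincide. We thus consider a convergent subsequence extracted from $(Q_k)_{k \geq 0}$,
say $(Q_{\psi(k)})_{k \geq 0}$, where for each $k$, $\psi(k)$ is an integer, and denote by $Q_*^{\psi}$ its limit. If we prove
that

\[
\langle \nabla\bar{I}(Q_*^{\psi}), Q - Q_*^{\psi} \rangle \leq 0
\tag{114}
\]

for each $Q \in \mathcal{C}_1$, Proposition 5-(ii) will imply that $Q_*^{\psi}$ coincides with the argmax $Q_*$ of $\bar{I}$
over $\mathcal{C}_1$. This will prove that the limit of every convergent subsequence converges towards $Q_*$,
which in turn will show that the whole sequence $(Q_k)_{k \geq 0}$ converges to $Q_*$.

In order to prove (114), consider the iteration $\psi(k)$ of the algorithm. The matrix $Q_{\psi(k)}$
maximizes the function $Q \mapsto V(\delta_{\psi(k)}, \tilde{\delta}_{\psi(k)}, Q)$. As this function is strictly concave and
differentiable, Proposition 4 implies that

\[
\langle \nabla_Q V(\delta_{\psi(k)}, \tilde{\delta}_{\psi(k)}, Q_{\psi(k)}), Q - Q_{\psi(k)} \rangle \leq 0
\tag{115}
\]

for every $Q \in \mathcal{C}_1$ (recall that $\nabla_Q$ represents the derivative of $V(\kappa, \tilde{\kappa}, Q)$ with respect to $V$'s
third component). We now consider the pair of solutions $(\delta_{\psi(k)+1}, \tilde{\delta}_{\psi(k)+1})$ of the system (33)
associated with matrix $Q_{\psi(k)}$.

Due to the continuity of $\delta(Q)$ and $\tilde{\delta}(Q)$, the convergence of the subsequence $Q_{\psi(k)}$ implies
the convergence of the subsequences $(\delta_{\psi(k)+1}, \tilde{\delta}_{\psi(k)+1})$ towards a limit $(\delta_*^{\psi}, \tilde{\delta}_*^{\psi})$. The pair
$(\delta_*^{\psi}, \tilde{\delta}_*^{\psi})$ is the solution of system (33) associated with $Q_*^{\psi}$, i.e. $\delta_*^{\psi} = \delta(Q_*^{\psi})$ and $\tilde{\delta}_*^{\psi} = \tilde{\delta}(Q_*^{\psi})$;
in particular:

\[
\frac{\partial V}{\partial \kappa}(\delta_*^{\psi}, \tilde{\delta}_*^{\psi}, Q_*^{\psi}) = \frac{\partial V}{\partial \tilde{\kappa}}(\delta_*^{\psi}, \tilde{\delta}_*^{\psi}, Q_*^{\psi}) = 0
\]

(see for instance (56)). Using the same computation as in the proof of Proposition 5, we obtain

\[
\langle \nabla\bar{I}(Q_*^{\psi}), Q - Q_*^{\psi} \rangle = \langle \nabla_Q V(\delta_*^{\psi}, \tilde{\delta}_*^{\psi}, Q_*^{\psi}), Q - Q_*^{\psi} \rangle
\tag{116}
\]

for every $Q \in \mathcal{C}_1$. Now condition (57) implies that the subsequence $(\delta_{\psi(k)}, \tilde{\delta}_{\psi(k)})$ also converges
toward $(\delta_*^{\psi}, \tilde{\delta}_*^{\psi})$. As a consequence,

\[
\lim_{k \to \infty} \langle \nabla_Q V(\delta_{\psi(k)}, \tilde{\delta}_{\psi(k)}, Q_{\psi(k)}), Q - Q_{\psi(k)} \rangle = \langle \nabla_Q V(\delta_*^{\psi}, \tilde{\delta}_*^{\psi}, Q_*^{\psi}), Q - Q_*^{\psi} \rangle .
\]

Inequality (115) thus implies that $\langle \nabla_Q V(\delta_*^{\psi}, \tilde{\delta}_*^{\psi}, Q_*^{\psi}), Q - Q_*^{\psi} \rangle \leq 0$ and relation (116) allows us
to conclude the proof.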

Appendix VI 
End of proof of Proposition 3 

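Proof of Proposition 3 relies on properties of $\bar{Q}_*$ established in Proposition 5-(iii). Denote
by

\[
\mathcal{A} = \max\left(\sup_t \|A\|, \sup_t \|C\|, \sup_t \|\tilde{C}\|\right) < \infty \quad \text{and} \quad a = \min\left(\inf_t \lambda_{\min}(C), \inf_t \lambda_{\min}(\tilde{C})\right) > 0 .
\]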
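Proof of (i): Recall that by Proposition 5-(iii), $\bar{Q}_*$ maximizes $\log\det(I + QG(\delta_*, \tilde{\delta}_*))$.
This implies that the eigenvalues $(\lambda_j(\bar{Q}_*))$ are the solutions of the waterfilling equation

\[
\forall j = 1, \ldots, t, \quad \lambda_j(\bar{Q}_*) = \max\left(\gamma - \frac{1}{\lambda_j(G)}, 0\right)
\]

where $\gamma$ is tuned in such a way that $\sum_j \lambda_j(\bar{Q}_*) = t$. It is clear from this equation that $\|\bar{Q}_*\| \leq \gamma$.
If $\gamma \leq \lambda_{\min}(G)^{-1}$ then $\|\bar{Q}_*\| \leq \lambda_{\min}(G)^{-1}$. If $\gamma > \lambda_{\min}(G)^{-1}$ then $\gamma > \lambda_j(G)^{-1}$ for every $j$ and we
have:

\[
t = \sum_{j=1}^{t} \lambda_j(\bar{Q}_*) = t\gamma - \sum_{j=1}^{t} \frac{1}{\lambda_j(G)},
\]

hence

\[
\gamma = 1 + \frac{1}{t}\sum_{j=1}^{t} \frac{1}{\lambda_j(G)} \leq 1 + \frac{1}{\lambda_{\min}(G)} .
\]

In both cases, we have

\[
\|\bar{Q}_*\| \leq 1 + \frac{1}{\lambda_{\min}(G)} .
\tag{117}
\]

It remains to prove

\[
\forall Q \in \mathcal{C}_1, \quad \inf_t \lambda_{\min}\left(G(\delta(Q), \tilde{\delta}(Q))\right) > 0
\tag{118}
\]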



and we are done. To this end, we first show that $\inf_t \delta(Q) > 0$ for all $Q \in \mathcal{C}_1$. From Equations
(40) and (42), we have:



^(Q) 



> Amm(C)jtrTx(a2) 

(a) 

^ Ajxiin 



C) 
K 



-tr ( a^Ir + 



K + l 



K + 1 
6 

K + l 



6C 



(b) 

^ Amin( 



C) ( jtr ( a% + 



-AQA' 



(119) 



K + l K + l 

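where (a) follows from Jensen's inequality and (b) is due to the facts that $\|(I + Y)^{-1}\| \leq 1$
and $\mathrm{tr}(XY) \leq \|X\|\mathrm{tr}(Y)$ when $Y$ is a nonnegative matrix. We now find an upper bound for
$\tilde{\delta}$. From (41) and (13), we have $\|\tilde{T}_K(\sigma^2)\| \leq 1/\sigma^2$. Using (42) we then have

\[
\tilde{\delta} \leq \|\tilde{T}_K\|\,\frac{1}{t}\mathrm{tr}\,\tilde{C}Q \leq \|\tilde{T}_K\|\,\|\tilde{C}\|\,\frac{1}{t}\mathrm{tr}\,Q \leq \frac{\mathcal{A}}{\sigma^2}
\]

(recall that $\frac{1}{t}\mathrm{tr}\,Q = 1$). Getting back to (119), we easily obtain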



-tr a%. + 



a 



K+l 



8C + 



K 



K + l 



AQA 



< 



a' + 



A 



A^K 



K + lJ^KT-l^^^ V(t,.),-^c 
where $C_0$ is a certain constant term. Hence we have $\delta(Q) \geq a\,C_0^{-1}$. By inspecting the expression
(50) of $G(\delta, \tilde{\delta})$, we then obtain

\[
\lambda_{\min}(G) \geq \frac{\delta}{K+1}\,\lambda_{\min}(\tilde{C}) \geq \frac{a^2\,C_0^{-1}}{K+1}
\]

and (118) is proven. It remains to plug this estimate into (117) and (i) is proved.
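
The waterfilling step and the resulting norm bound can be illustrated with a short numerical sketch. It takes a list of positive eigenvalues standing for those of $G$, finds the water level by bisection, and checks the bound $1 + 1/\lambda_{\min}(G)$; the function name `waterfill` and the eigenvalues used are illustrative assumptions, not values from the paper.

```python
def waterfill(g_eigs, total, iterations=200):
    """Return (gamma, q) with q[j] = max(gamma - 1/g_eigs[j], 0) and sum(q) = total.

    g_eigs must contain strictly positive numbers; gamma is found by bisection.
    """
    inv = [1.0 / g for g in g_eigs]
    lo, hi = 0.0, total + max(inv)  # allocated power is 0 at lo and >= total at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(max(mid - x, 0.0) for x in inv) > total:
            hi = mid
        else:
            lo = mid
    gamma = 0.5 * (lo + hi)
    return gamma, [max(gamma - x, 0.0) for x in inv]


# Illustrative eigenvalues of G, with t = 4 and power constraint sum(q) = t.
g_eigs = [4.0, 2.0, 0.5, 0.1]
gamma, q = waterfill(g_eigs, total=len(g_eigs))

# Every eigenvalue of the optimal Q is at most gamma, and at most 1 + 1/lambda_min(G).
assert max(q) <= gamma
assert max(q) <= 1.0 + 1.0 / min(g_eigs)
```

In this example the weakest mode receives no power, and the water level is determined by the three remaining modes only.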

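Proof of (ii): We begin by restricting the maximization of $I(Q)$ to the set $\mathcal{C}_1^d = \{Q :
Q = \mathrm{diag}(q_1, \ldots, q_t) \geq 0, \mathrm{tr}(Q) = t\}$ of the diagonal matrices within $\mathcal{C}_1$, and show that $Q_*^d =
\arg\max_{Q \in \mathcal{C}_1^d} I(Q)$ satisfies $\sup_t \|Q_*^d\| < \infty$ where the bound is a function of $(a, \mathcal{A}, \sigma^2, c, K)$
only. The set $\mathcal{C}_1^d$ is clearly convex and the solution $Q_*^d$ is given by the Lagrange Karush-Kuhn-Tucker
(KKT) conditions

\[
\frac{\partial I(Q)}{\partial q_j} = \frac{\partial}{\partial q_j}E\left[\mathcal{I}(Q)\right] = \eta - \beta_j
\tag{120}
\]

where $\mathcal{I}(Q) = \log\det\left(I_r + \frac{1}{\sigma^2}HQH^H\right)$ and the Lagrange multipliers $\eta$ and the $\beta_j$ are
associated with the power constraint and with the positivity constraints respectively. More
specifically, $\eta$ is the unique real positive number for which $\sum_{j=1}^{t} q_j = t$, and the $\beta_j$ satisfy
$\beta_j = 0$ if $q_j > 0$ and $\beta_j \geq 0$ if $q_j = 0$. We have

\[
\frac{\partial \mathcal{I}(Q)}{\partial q_j} = \frac{1}{\sigma^2}\,h_j^H\left(I_r + \frac{1}{\sigma^2}HQH^H\right)^{-1}h_j
\]

where $h_j$ is the $j$th column of $H$. By consequence, $\partial\mathcal{I}(Q)/\partial q_j \leq \|h_j\|^2/\sigma^2$. As $h_j$ is a
Gaussian vector, the expectation of the right-hand side of this inequality is defined and therefore, by the Dominated
Convergence Theorem, we can exchange $\partial/\partial q_j$ with $E$ in Equation (120) and write

\[
\frac{\partial I(Q)}{\partial q_j} = \frac{1}{\sigma^2}\,E\left[h_j^H\left(I_r + \frac{1}{\sigma^2}HQH^H\right)^{-1}h_j\right] .
\tag{121}
\]

Let us denote by $H_j$ the $r \times (t - 1)$ matrix that remains after extracting $h_j$ from $H$. Similarly,
we denote by $Q_j$ the $(t - 1) \times (t - 1)$ diagonal matrix that remains after deleting row and
column $j$ from $Q$. Writing $R_j = \left(I_r + \frac{1}{\sigma^2}H_jQ_jH_j^H\right)^{-1}$, we have by the Matrix Inversion
Lemma ([20, §0.7.4])

\[
\left(I_r + \frac{1}{\sigma^2}HQH^H\right)^{-1} = R_j - \frac{q_j\,R_jh_jh_j^HR_j}{\sigma^2 + q_j\,h_j^HR_jh_j} .
\]

By plugging this expression into the right-hand side of Equation (121), the Lagrange-KKT
conditions become

\[
E\left[\frac{X_j}{\sigma^2 + q_jX_j}\right] = \eta - \beta_j
\tag{122}
\]

where $X_j = h_j^HR_jh_j$. A consequence of this last equation is that $q_j \leq 1/\eta$ for every $j$. Indeed,
assume that $q_j > 1/\eta$ for some $j$. Then $\sigma^2 + q_jX_j > X_j/\eta$, hence $E\left[\frac{X_j}{\sigma^2 + q_jX_j}\right] < \eta$, therefore
$\beta_j > 0$ by (122), which implies that $q_j = 0$, a contradiction. As a result, in order to prove that
$\sup_t \|Q_*^d\| < \infty$, it will be enough to prove that $\sup_t 1/\eta < \infty$. To this end, we shall prove
that there exists a constant $C > 0$ such that

\[
\max_{j=1,\ldots,t} P(X_j \leq C) \xrightarrow[t \to \infty]{} 0 .
\tag{123}
\]

Indeed, let us admit (123) temporarily. We have

\[
\begin{aligned}
E\left[\frac{X_j}{\sigma^2 + q_jX_j}\right] &= E\left[\frac{X_j}{\sigma^2 + q_jX_j}\mathbf{1}_{X_j > C}\right] + E\left[\frac{X_j}{\sigma^2 + q_jX_j}\mathbf{1}_{X_j \leq C}\right] \\
&\geq \frac{C}{\sigma^2 + q_jC}\,P(X_j > C) = \frac{C}{\sigma^2 + q_jC} - \epsilon_j
\end{aligned}
\]

where $\epsilon_j = \frac{C}{\sigma^2 + q_jC}\,P(X_j \leq C)$ and the inequality is due to the fact that the function $f(x) = \frac{x}{\sigma^2 + q_jx}$
is increasing. As

\[
\max_{j=1,\ldots,t} |\epsilon_j| \leq \frac{C}{\sigma^2}\max_{j=1,\ldots,t} P(X_j \leq C) \xrightarrow[t \to \infty]{} 0
\]

by (123), we have

\[
\liminf_{t \to \infty}\,\min_{j=1,\ldots,t}\left(E\left[\frac{X_j}{\sigma^2 + q_jX_j}\right] - \frac{C}{\sigma^2 + q_jC}\right) \geq 0 .
\]

Getting back to the Lagrange KKT condition (122) we therefore have for $t$ large enough
$\eta - \beta_j > \frac{C}{2\sigma^2 + q_jC}$ for every $j = 1, \ldots, t$. By consequence,

\[
\frac{1}{\eta} \leq \frac{1}{\eta - \beta_j} < \frac{2\sigma^2}{C} + q_j
\]

for large $t$. Summing over $j$ and taking into account the power constraint $\sum_j q_j = t$, we obtain
$\frac{t}{\eta} < \frac{2\sigma^2 t}{C} + t$, i.e. $\frac{1}{\eta} < \frac{2\sigma^2}{C} + 1$ and

\[
\sup_t \|Q_*^d\| < \frac{2\sigma^2}{C} + 1
\tag{124}
\]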

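which is the desired result. To prove (123), we make use of MMSE estimation theory. Recall
that $H = \sqrt{\frac{K}{K+1}}\,A + \frac{1}{\sqrt{K+1}}\frac{1}{\sqrt{t}}\,C^{1/2}W\tilde{C}^{1/2}$. Denoting by $a_j$ and $z_j$ the $j$th columns of the
matrices $A$ and $W\tilde{C}^{1/2}$ respectively, we have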

1 



Xi 



K 



af + 



1 



K 



1 



1 



K + l ' ./WTlVt 
We decompose $z_j$ as $z_j = u_j + u_j^{\perp}$ where $u_j$ is the conditional expectation $u_j =
E[z_j \,|\, z_1, \ldots, z_{j-1}, z_{j+1}, \ldots, z_t]$, in other words, $u_j$ is the MMSE estimate of $z_j$ drawn from
the other columns of $W\tilde{C}^{1/2}$. Put



(125) 
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Then 




(126) 



Let us study the asymptotic behaviour of $S_j$. First, we note that due to the fact that
the joint distribution of the elements of $W\tilde{C}^{1/2}$ is the Gaussian distribution, $u_j^{\perp}$ and
$v_j = [z_1^T, \ldots, z_{j-1}^T, z_{j+1}^T, \ldots, z_t^T]^T$ are independent. By consequence, $u_j^{\perp}$ and $(R_j, u_j)$ are
independent. Let us derive the expression of the covariance matrix $R_u = E[u_j^{\perp}u_j^{\perp H}]$.
From the well known formulas for MMSE estimation ([35]), we have $R_u = E[z_jz_j^H] -
E[z_jv_j^H]\left(E[v_jv_j^H]\right)^{-1}E[v_jz_j^H]$. To obtain $R_u$, we note that the covariance matrix of the
vector $z = [z_1^T, \ldots, z_t^T]^T$ is $E[zz^H] = \tilde{C}^T \otimes I_r$ (just check that $E\left[[W\tilde{C}^{1/2}]_{ij}\,\overline{[W\tilde{C}^{1/2}]_{k\ell}}\right] =
\delta(i - k)[\tilde{C}]_{\ell j}$). Let us denote by $\tilde{c}_j$, $\tilde{\mathbf{c}}_j$ and $\tilde{C}_j$ the scalar $\tilde{c}_j = [\tilde{C}]_{jj}$, the $j$th vector column of $\tilde{C}$
without element $\tilde{c}_j$, and the $(t - 1) \times (t - 1)$ matrix that remains after extracting row and column
$j$ from $\tilde{C}$ respectively. With these notations we have $R_u = \left(\tilde{c}_j - \tilde{\mathbf{c}}_j^H\tilde{C}_j^{-1}\tilde{\mathbf{c}}_j\right)I_r$. Recalling that
$u_j^{\perp}$ and $(R_j, u_j)$ are independent, one may see that the first term of the right-hand side of (125)
is negligible while the second is close to $\rho_j = \frac{1}{K+1}\left(\tilde{c}_j - \tilde{\mathbf{c}}_j^H\tilde{C}_j^{-1}\tilde{\mathbf{c}}_j\right)\frac{1}{t}\mathrm{tr}(R_jC)$. More rigorously, using
this independence in addition to $\mathcal{A} = \max(\|A\|, \|C\|, \|\tilde{C}\|) < \infty$ and $\|R_j\| \leq 1$, we can prove
with the help of [1, Lemma 2.7] or by direct calculation that there exists a constant $C_1$ such
that

\[
E\left[(S_j - \rho_j)^2\right] \leq \frac{C_1}{t} .
\tag{127}
\]



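In order to prove (123), we will prove that the $\rho_j$ are bounded away from zero in some sense. First, we have

$$c_j - \mathbf{c}_j^H\mathbf{C}_j^{-1}\mathbf{c}_j \overset{(a)}{=} \frac{1}{[\mathbf{C}^{-1}]_{jj}} \overset{(b)}{\ge} \|\mathbf{C}^{-1}\|^{-1} = \lambda_{\min}(\mathbf{C}) \ge a$$

(for (a) see [20, §0.7.3] and for (b), use the fact that $|[\mathbf{X}]_{kl}| \le \|\mathbf{X}\|$ for any element $(k,l)$ of a matrix $\mathbf{X}$). By consequence,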



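$$\rho_j \ge \frac{a}{K+1}\,\frac{1}{t}\mathrm{tr}(\mathbf{R}_j\tilde{\mathbf{C}}) \ge \frac{a^2}{K+1}\,\frac{1}{t}\mathrm{tr}(\mathbf{R}_j) \overset{(a)}{\ge} \frac{a^2}{K+1}\,\frac{r}{t}\left(\frac{1}{r}\mathrm{tr}(\mathbf{R}_j^{-1})\right)^{-1} \overset{(b)}{\ge} \frac{a^2}{K+1}\,\frac{r}{t}\left(1 + \frac{1}{r\sigma^2}\left(\|\mathbf{A}\| + \|\tilde{\mathbf{C}}^{1/2}\|\,\|\mathbf{C}^{1/2}\|\,\left\|\tfrac{1}{\sqrt{t}}\mathbf{W}\right\|\right)^2 \mathrm{tr}(\mathbf{Q})\right)^{-1}$$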

where (a) is Jensen's inequality and (b) is due to $\mathrm{tr}(\mathbf{X}\mathbf{Y}) \le \|\mathbf{X}\|\,\mathrm{tr}(\mathbf{Y})$ when $\mathbf{Y}$ is a nonnegative matrix. As $\lim_t \left\|\frac{1}{\sqrt{t}}\mathbf{W}\right\| = 1 + \sqrt{1/c}$ with probability one ([1]), and furthermore, $\mathrm{tr}(\mathbf{Q}) = t$, we have with probability one

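$$\liminf_{t\to\infty}\, \min_{j=1,\dots,t} \rho_j \ge \frac{a^2\sigma^2}{c\,(K+1)}\left(\sigma^2 + c\,A^2\left(2 + c^{-1/2}\right)^2\right)^{-1} = C_2. \tag{128}$$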
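As a numerical sanity check (a sketch only, not part of the proof), the Python snippet below verifies two ingredients used above: the Schur complement identity $c_j - \mathbf{c}_j^H\mathbf{C}_j^{-1}\mathbf{c}_j = 1/[\mathbf{C}^{-1}]_{jj} \ge \lambda_{\min}(\mathbf{C})$, and the fact that the spectral norm of $\frac{1}{\sqrt{t}}\mathbf{W}$ is close to $1 + \sqrt{1/c}$ for large dimensions. The matrix sizes, the random seed, the index `j` and the way the positive definite matrix `C` is generated are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t, r = 200, 100          # hypothetical sizes, so that c = t / r = 2

# Hypothetical Hermitian positive definite matrix standing in for C (t x t).
G = rng.standard_normal((t, 2 * t)) + 1j * rng.standard_normal((t, 2 * t))
C = G @ G.conj().T / (2 * t)

j = 3
keep = [k for k in range(t) if k != j]
c_jj = C[j, j].real                  # scalar c_j = [C]_jj
c_vec = C[keep, j]                   # column j of C with entry j removed
C_sub = C[np.ix_(keep, keep)]        # C with row and column j removed

# Schur complement c_j - c_j^H C_j^{-1} c_j and the two quantities it is compared with.
schur = c_jj - (c_vec.conj() @ np.linalg.solve(C_sub, c_vec)).real
inv_diag = 1.0 / np.linalg.inv(C)[j, j].real
lam_min = np.linalg.eigvalsh(C)[0]   # eigenvalues are returned in ascending order

# Spectral norm of W / sqrt(t) for an r x t matrix W with i.i.d. CN(0, 1) entries.
W = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
norm_W = np.linalg.norm(W / np.sqrt(t), 2)
limit = 1 + np.sqrt(r / t)           # 1 + sqrt(1/c)

print(schur, inv_diag, lam_min)
print(norm_W, limit)
```

The first printed line should show two nearly equal numbers followed by a smaller one; the second line compares the finite-size norm with its almost sure limit.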

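Choose the constant $C$ in the left-hand side of (123) as $C = C_2/4$. From (126) we have

$$\begin{aligned}
\max_j \mathbb{P}(X_j \le C) &\le \max_j \mathbb{P}(S_j \le C) \\
&\le \max_j \mathbb{P}\left(S_j \le C,\ |S_j - \rho_j| > C\right) + \max_j \mathbb{P}\left(S_j \le C,\ |S_j - \rho_j| \le C\right) \\
&\le \max_j \mathbb{P}\left(|S_j - \rho_j| > C\right) + \max_j \mathbb{P}\left(\rho_j \le 2C\right) \\
&\overset{(a)}{\le} \frac{1}{C^2}\max_j \mathbb{E}\left[(S_j - \rho_j)^2\right] + \max_j \mathbb{P}\left(\rho_j \le 2C\right) \\
&\overset{(b)}{\le} \frac{1}{C^2}\max_j \mathbb{E}\left[(S_j - \rho_j)^2\right] + \mathbb{P}\left(\min_j \rho_j \le 2C\right) \\
&\overset{(c)}{=} o(1)
\end{aligned}$$

where (a) is Tchebychev's inequality, (b) is due to $\max_j \mathbb{P}(\mathcal{E}_j) \le \mathbb{P}(\cup_j \mathcal{E}_j)$, and (c) is due to (127) and to (128).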

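We have proven (123) and hence that $\mathbf{Q}_*^d = \arg\max_{\mathbf{Q} \in \mathcal{C}_1^d} I(\mathbf{Q})$ satisfies $\sup_t \|\mathbf{Q}_*^d\| < \infty$. In order to prove that $\mathbf{Q}_* = \arg\max_{\mathbf{Q} \in \mathcal{C}_1} I(\mathbf{Q})$ satisfies $\sup_t \|\mathbf{Q}_*\| < \infty$, we begin by noticing that

$$\max_{\mathbf{Q} \in \mathcal{C}_1} I(\mathbf{Q}) = \max_{\mathbf{U} \in \mathcal{U}_t}\ \max_{\boldsymbol{\Lambda} \in \mathcal{C}_1^d} \mathbb{E}\left[\log\det\left(\mathbf{I}_r + \frac{1}{\sigma^2}\mathbf{H}\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H\mathbf{H}^H\right)\right] \tag{129}$$

where $\mathcal{U}_t$ is the group of unitary $t \times t$ matrices. For a given matrix $\mathbf{U} \in \mathcal{U}_t$, the inner maximization in (129) is equivalent to the problem of maximizing the mutual information over $\mathcal{C}_1^d$ when the channel matrix $\mathbf{H}$ is replaced with $\mathbf{H}' = \mathbf{H}\mathbf{U} = \sqrt{\frac{K}{K+1}}\,\mathbf{A}' + \frac{1}{\sqrt{K+1}}\frac{1}{\sqrt{t}}\,\tilde{\mathbf{C}}^{1/2}\mathbf{W}'\mathbf{C}'^{1/2}$. Here, matrix $\mathbf{C}'$ is defined by $\mathbf{C}' = \mathbf{U}^H\mathbf{C}\mathbf{U}$, $\mathbf{A}' = \mathbf{A}\mathbf{U}$, and $\mathbf{W}' = \mathbf{W}\boldsymbol{\Theta}$ where $\boldsymbol{\Theta}$ is the unitary matrix $\boldsymbol{\Theta} = \mathbf{C}^{1/2}\mathbf{U}\mathbf{C}'^{-1/2}$. As $\mathbf{U} \in \mathcal{U}_t$, we clearly have $\|\mathbf{A}'\| = \|\mathbf{A}\|$, $\|\mathbf{C}'\| = \|\mathbf{C}\|$, and $\|\mathbf{C}'^{-1}\| = \|\mathbf{C}^{-1}\|$. By consequence, the bounds $a$ and $A$, and hence the constant $C$ in the left-hand side of (123) (which depends only on $(a, A, \sigma^2, c, K)$), remain unchanged when we replace $\mathbf{H}$ with $\mathbf{H}'$. By consequence, for every $\mathbf{U} \in \mathcal{U}_t$, the matrix $\boldsymbol{\Lambda}_*(\mathbf{U})$ that maximizes $\mathbb{E}\left[\log\det\left(\mathbf{I}_r + \frac{1}{\sigma^2}\mathbf{H}\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^H\mathbf{H}^H\right)\right]$ satisfies $\|\boldsymbol{\Lambda}_*(\mathbf{U})\| < 2\sigma^2/C + 1$ (see (124)), a bound which is independent of $\mathbf{U}$. Hence $\|\mathbf{Q}_*\| < 2\sigma^2/C + 1$, which terminates the proof of (ii).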
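The change of variables used in this last step can be checked numerically. The sketch below, which is only an illustration, builds a hypothetical positive definite $\mathbf{C}$ and a random unitary $\mathbf{U}$, forms $\mathbf{C}' = \mathbf{U}^H\mathbf{C}\mathbf{U}$ and $\boldsymbol{\Theta} = \mathbf{C}^{1/2}\mathbf{U}\mathbf{C}'^{-1/2}$, and verifies that $\boldsymbol{\Theta}$ is unitary and that $\mathbf{W}\mathbf{C}^{1/2}\mathbf{U} = \mathbf{W}\boldsymbol{\Theta}\,\mathbf{C}'^{1/2}$. The helper `herm_power`, the sizes and the seed are assumptions introduced here.

```python
import numpy as np

def herm_power(M, p):
    """M**p for a Hermitian positive definite matrix M, via its eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.conj().T

rng = np.random.default_rng(1)
t, r = 6, 4              # hypothetical small sizes

# Hypothetical positive definite C (t x t) and a random unitary U from a QR factorization.
G = rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t))
C = G @ G.conj().T + np.eye(t)
U, _ = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))

C_prime = U.conj().T @ C @ U
Theta = herm_power(C, 0.5) @ U @ herm_power(C_prime, -0.5)

W = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
lhs = W @ herm_power(C, 0.5) @ U               # W C^{1/2} U
rhs = (W @ Theta) @ herm_power(C_prime, 0.5)   # W' C'^{1/2} with W' = W Theta

unitary_err = np.linalg.norm(Theta.conj().T @ Theta - np.eye(t))
factor_err = np.linalg.norm(lhs - rhs)
norm_gap = abs(np.linalg.norm(C_prime, 2) - np.linalg.norm(C, 2))
print(unitary_err, factor_err, norm_gap)
```

All three printed quantities should be at the level of floating-point rounding, in line with the claims that $\boldsymbol{\Theta}$ is unitary and that $\|\mathbf{C}'\| = \|\mathbf{C}\|$.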